Thank you for staying with us. We're going to be talking about the evolution of early proteins from amino acids, and our first speaker is Joanna (I apologize if I mispronounce your name, is it Masel?), who is joining us remotely from Arizona. She is a professor at the University of Arizona, and she'll be talking to us about long-term evolution of the ...

Thank you for having me, and thank you to everyone who's staying until the last session. I have a simpler title here than in the program. Basically, what we're trying to do is figure out how the proteome evolves, so that we can project back and ask what the early proteome was like. This is a top-down approach: what we can deduce from modern proteins, in particular which amino acids are used and why they're used, and questions about whether function drives this or availability drives this.

Before we can really project back, we have to ask how far back we are projecting: what is the
origin of a given protein-coding gene that we look at? The traditional answer was that if you look at some gene, it has diverged from a duplicate of some other gene. But then you need to know where that other gene came from: well, it must have diverged, after duplicating, from some other gene. So the view was one of some ancient big bang of all genes in the distant past, which came from some primordial ancestor, with the same genes sorting out ever since.

That view has recently been overturned. What we now have more or less incontrovertible evidence for is that at some rate (and we can dispute what the rate is) there is continuous creation: de novo genes come out of non-coding or frameshifted DNA, and they have no previous coding ancestor. This is very different from species, which all go back to some sort of LUCA; genes have separate origins throughout the history of life.

So what we then do is we
can classify, and a better thing to classify than genes turns out to be protein domains. Genes are modular assortments of different domains that might have different ages. So we classify each domain in the Pfam database according to when it was born, which we can figure out from where it has homologs. Again, we're focusing on homologs, not orthologs. A lot of people focus on orthologs because they're trying to deduce function: orthologs carry the idea that it's somehow the same gene, rather than some paralog, which is also related but a different gene. That's not an evolutionarily rigorous distinction, but whether or not genes are related to each other by descent with modification is, so we include all homologs.

Then we look at trends as a function of how long domains have had to evolve. I originally started out very interested in things like aggregation propensity and intrinsic structural disorder, and I've become a bit disenchanted with that over time, because what we found is that all the lovely
predictors that take sequences and tell you what they do turn out to tell you something almost identical if you take the amino acids and feed them in in random order. So the main predictors tend to be the frequencies of each of the 20 amino acids, and most things follow from that.

Here's an example, where we look at the frequency of proline across all the Pfam domains we've looked at. You can see there's a strong trend, in brown, among domains that arose in animals. It's much flatter, in green, in domains that arose in plants, and also relatively flat among ancient domains of different levels of antiquity.

If we take the slope of each of these, plot the 20 slopes for the 20 amino acids, and do this for the three most ancient groups, what we see is a correlation between the hypothesized order in which the amino acids were recruited into the genetic code and their slope. What this is saying is that the amino acids
that were recruited first are overrepresented in domains that date back to LUCA, relative to other old domains. We think the reason is that, even though we're assuming the genetic code had more or less settled down by LUCA, at that point in time these amino acids nevertheless remained much more available, so they were used more, and some of the newer amino acids were still somewhat oddities that were less available.

We see the same bias towards using available amino acids in plants; it's just a different set of amino acids that counts as available. In plants there's a lot of cysteine in the cell, because it's produced during sulfur assimilation and it's also pretty important against reactive oxygen. Cysteine is very metabolically available, and glutamate and aspartate are also very abundant. Those are the three amino acids that we see enriched in younger plant domains: when new stuff gets invented, it tends to use what's most
available.

The situation is different in animals, where we find more evidence that function is driving things. There was a big evolution experiment done in Diethard Tautz's lab, where random peptides were expressed from plasmids and the lineages were competed against each other, and we calculated the marginal effect of having one amino acid versus another. We found that those marginal effects correlated with these phylostratigraphy trends in animals, so that young animal proteins tend to use more harmless amino acids.

We also have another technique, where we look at which amino acids are very slightly preferred in species that have stronger codon bias compared to species with weaker codon bias, that is, species that are able to make finer distinctions compared to those that aren't. We find that the same amino acids are preferred today in vertebrates as are also
preferred in this E. coli experiment.

Only one trend did we find to be consistent across the whole history of life, and that is a clustering metric. If you take the five most hydrophobic amino acids, in some proteins, like the one at the top here, they're very clustered along the primary sequence, and in other proteins they're very dispersed. There's a huge trend in this that goes back basically as far as we can reconstruct. Young genes are random (a clustering value of one means it's basically random), and in genes that have had a long time to evolve we see this more interspersed result, where the hydrophobic amino acids are less likely to be near one another.

When we see these trends, there are two mechanisms that we think might be driving them. I think what most people immediately jump to is: okay, older things have had more time to evolve, and the classic process of evolution by descent with modification, where alleles that are more
in one direction take over from others that aren't, drives it, and somehow it's just so slow that it's taken all this time.

But another hypothesis is that everything was there originally, with huge diversity, but some things have been differentially lost. So what we're seeing over longer and longer periods of time is the survivors, who were always like that, even when they were born, but they're the ones who went the distance. We're currently trying to figure out which trends are driven by which of these mechanisms.

When we're trying to think about what LUCA was like, just as we all know that most species that ever lived are now extinct, the same is likely true for LUCA's protein domains: most of them have no contemporary descendants, and we only study the ones that do.

So we use this maximum likelihood technique to attempt to quantify the rate of loss, that is, total loss of a Pfam
domain, across different lineages. What we find is a nonlinear effect with an optimal value, shown here for the clustering metric, and that optimal value, with the lowest rate of loss, does indeed match what you see in the very oldest Pfams. Along the same lines, this could help explain why we see greater variation among the younger domains and less variation among the older domains. So this is really showing some evidence that differential loss is driving some of this.

So, to ask what the early proteome was like: well, we're still looking into it, but the preliminary conclusions so far are, firstly, that the contemporary descendants are probably unrepresentative. They've had more time to evolve, and they're a highly biased set of descendants. So looking at the field of de novo genes, and at what things get invented from scratch, could be informative, and we should consider the likelihood that there was a lot of that kind of
thing around back in the ancient proteome, and we just no longer see its descendants. We also have some hints that amino acids that were abundant back then were perhaps a bit more common than now, in particular glycine, alanine, and valine. Those are the preliminary conclusions as we continue to work on this.

Thanks especially to the people in the top row; this was a lot of work that I compressed into this talk. Thanks also to the people in the bottom row for their contributions, and to more people who aren't even listed here. I tried to put it in as tightly as ...

[Applause]

Thank you very much, Joanna. We have a question for you from the audience.

Hi, this is Anthony Brunetti from Georgia Tech. I saw you were seeing differences in the clustering of hydrophobicity in sequences identified as young and old. Could that have anything to do with preferences for different kinds of secondary structure in old versus young domains?
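
The clustering of hydrophobic residues that this question refers to can be made concrete with a small sketch. It is an illustration only, not the speaker's actual metric: the hydrophobic set (`ILVFM`), the block size of 6, and the simple binomial normalization are all assumptions chosen so that a value near 1 means roughly random placement, values above 1 mean clustering, and values below 1 mean interspersion.

```python
from statistics import pvariance

# Assumption: the talk says "the five most hydrophobic amino acids" without
# naming them; this particular set is a placeholder for illustration.
HYDROPHOBIC = frozenset("ILVFM")


def clustering_index(seq, block=6):
    """Dispersion of hydrophobic residues along a sequence.

    The sequence is cut into consecutive blocks of `block` residues, the
    hydrophobic residues in each block are counted, and the variance of those
    counts is divided by the variance expected if every position were
    hydrophobic independently with the sequence's overall frequency p
    (binomial variance: block * p * (1 - p)).
    """
    seq = seq.upper()
    n_blocks = len(seq) // block
    if n_blocks < 2:
        raise ValueError("sequence too short for the chosen block size")
    used = seq[: n_blocks * block]  # drop the incomplete trailing block
    counts = [
        sum(aa in HYDROPHOBIC for aa in used[i : i + block])
        for i in range(0, len(used), block)
    ]
    p = sum(counts) / len(used)
    expected = block * p * (1 - p)
    if expected == 0:
        raise ValueError("sequence has no hydrophobic/non-hydrophobic mix")
    return pvariance(counts) / expected
```

With these assumptions, `clustering_index("LLLLLLGGGGGG" * 2)` gives 6.0 (fully clustered blocks), while `clustering_index("LGLGLG" * 4)` gives 0.0 (perfectly interspersed).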
We don't think so. What we think drives this is that proteins have to do two things: they have to avoid doing harm (they have to avoid aggregating, misfolding, and so on), and they also have to do good. What we believe is driving this is the avoidance of harm. Interestingly, we weren't the first people to observe this anti-clustering, but it was previously attributed to all proteins, as a means of avoiding aggregation: having too many hydrophobic residues in a row during the translation process is going to increase the chance of something going wrong at that point. What we found is that it's found only in old proteins and not in young ones.

I think we have a few minutes, so, if you don't mind, I'd like to ask a question. In a number of your plots you have an x-axis labeled "age of the Pfam" in billions of years. I'm just curious: what metric do we use, what's our way of inferring that
or guessing that, for a given Pfam?

Yeah, so the method we're using is to have a big tree of life and to see where the homologs are detected. For some of these younger Pfams that's relatively good: these dates all come from TimeTree, and the somewhat consensus estimates based on divergence at the species level. Then there's a lot more uncertainty, as you all know, among the older age groups. The oldest is basically those that have been attributed to being in LUCA. The youngest among these older ones are found both in a fairly basal branch of eukaryotes and also at least in plants and animals, because we were doing a plant- and animal-focused study here. In between, we have things that are found only in prokaryotes but aren't believed to have been in LUCA. What numbers you want to give to these could definitely be open to
interpretation.

Lovely, thank you. Please join me in thanking our first presenter.

The second presentation will be by Valerio Giacobelli, who is visiting us from Charles University in Prague, in the Czech Republic. He is a postdoctoral fellow in the laboratory of Klára Hlouchová.

Hi everybody, I'm Valerio from Charles University, and I'm really thrilled today to show our recent work, "In vitro evolution reveals non-cationic protein-RNA interaction mediated by metal ions".

A brief introduction: what was the composition of the prebiotic world? As has already been heavily discussed at this conference, we know that two polymers mostly dominated the scene, peptides and RNA. We can argue about which was first, the RNA world or the peptide world, but what we know is that at some point in evolution these two polymers interacted with each other. It's important to note that there are different theories about the
composition of the ancient peptides. As the previous talk described, the amino acid composition of the peptides was different: it was probably a much simpler alphabet than what we have now. So we can distinguish two classes of amino acids, early amino acids and late amino acids, and we can also see here that among the early amino acids there are no positive charges, only negatively charged and aliphatic ones. So how is it possible that, in the prebiotic world, negatively charged molecules could interact with each other? What could this interaction between RNA and ancient peptides look like?

There are two hypothetical mechanisms. One, the most studied, and still present in modern cells, is electrostatic interaction between positive and negative charges: the positive charge of arginine or lysine and the phosphate backbone. In the case of the prebiotic world, it could be possible that arginine was not present, but
there was some non-canonical amino acid that disappeared during evolution. The other theory, which is what I'm going to talk about today, and which is for the moment just hypothetical, is that this interaction between negatively charged polymers can be mediated by metal ions, in particular magnesium.

How did we try to verify this hypothesis? First of all, we selected a template RNA-binding protein and created a library in which all the late amino acids were substituted with early amino acids. So we have a protein composed entirely of early amino acids, and we try to understand whether it is still able to bind the RNA.

The target we selected was the C-terminal domain of the ribosomal protein L11 from Geobacillus. We selected this target because it is a small domain, 80 amino acids, so simple to manage, especially for building the library. It is already rich in early amino acids: more than 74 percent was already early amino acids. We
know everything about it: it's conserved, we know the crystal structure, and we know the RNA target that this RNA-binding protein binds.

After that, we generated our library, in which every late amino acid was randomized with the set of early ones that we can see here. In the end we obtained a library with a size of around 10 to the power of 10 variants. With this kind of big library, we have to select the variants and verify whether there is something that is still able to bind the RNA. The method that we selected for this purpose was mRNA display.

Briefly, mRNA display is a selection method that links the genotype and the phenotype together through puromycin. We can select for function through the protein, which is bound to its own mRNA, so once we have selected a variant we can sequence it through the mRNA. Here is the general pipeline of the
method. We have the DNA library, which we transcribe in vitro and ligate to the puromycin molecule. Then, in a cell-free system, so just in vitro, we translate it and obtain the protein library linked to the mRNA. After that, we immobilize the RNA target on a solid support (beads) and select the variants. This cycle is repeated for something like seven rounds; in this case we performed 60 rounds.

On the right, we sequenced every round, and we can see the enrichment at every position of the mutagenesis, at every position of the library. We can see that, step by step, the population became enriched in negatively charged amino acids: the presence of aspartate and glutamate increases during the selection, until we arrive at the end, where we selected one variant, the most abundant in the mix. Let's call this one the E variant.
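
The per-round enrichment described above, where each selection round is sequenced and the share of negatively charged residues at the randomized positions is tracked, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis pipeline: the function names, the input layout (a dict mapping round number to translated protein reads of equal length), and the example reads are all hypothetical.

```python
from collections import Counter


def residue_fractions(reads, positions):
    """Fraction of each amino acid over the given (0-based) positions."""
    counts = Counter(read[i] for read in reads for i in positions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues to count (empty reads or positions)")
    return {aa: n / total for aa, n in counts.items()}


def acidic_fraction_by_round(rounds, positions):
    """Share of aspartate (D) plus glutamate (E) per selection round.

    `rounds` maps a round number to the list of protein sequences recovered
    in that round; `positions` lists the randomized library positions.
    """
    result = {}
    for round_number, reads in sorted(rounds.items()):
        fractions = residue_fractions(reads, positions)
        result[round_number] = fractions.get("D", 0.0) + fractions.get("E", 0.0)
    return result


if __name__ == "__main__":
    # Hypothetical toy reads with randomized positions 1 and 2.
    toy_rounds = {1: ["AGDV", "AGAV"], 2: ["ADEV", "AEDV"]}
    print(acidic_fraction_by_round(toy_rounds, positions=[1, 2]))
```

A rising value from one round to the next corresponds to the trend reported in the talk, where the selected population becomes richer in negatively charged residues.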
00:19:04,080 uh after that that we have our variant 503 00:19:08,470 --> 00:19:06,320 we need to prove it so we express it in 504 00:19:09,510 --> 00:19:08,480 e. coli purify it and verify the 505 00:19:11,669 --> 00:19:09,520 binding 506 00:19:14,710 --> 00:19:11,679 in comparison to the wild type protein so 507 00:19:16,950 --> 00:19:14,720 we have a scale we have like 508 00:19:18,870 --> 00:19:16,960 a comparison and we perform different 509 00:19:21,350 --> 00:19:18,880 techniques to verify the binding one of 510 00:19:23,510 --> 00:19:21,360 them was the emsa the electrophoretic 511 00:19:26,549 --> 00:19:23,520 mobility shift assay where we load on a 512 00:19:28,310 --> 00:19:26,559 native gel page gel like the free rna 513 00:19:30,950 --> 00:19:28,320 and the rna and protein in the 514 00:19:33,270 --> 00:19:30,960 complex and we can just 515 00:19:34,390 --> 00:19:33,280 see the shift between these two 516 00:19:36,150 --> 00:19:34,400 um 517 00:19:37,990 --> 00:19:36,160 between both the complex and the free 518 00:19:40,150 --> 00:19:38,000 rna and we can observe that the 519 00:19:42,470 --> 00:19:40,160 e-variant compared to wild type showed 520 00:19:44,230 --> 00:19:42,480 the same binding at least it might 521 00:19:45,350 --> 00:19:44,240 uh after that we were curious to know 522 00:19:47,350 --> 00:19:45,360 how is the 523 00:19:49,909 --> 00:19:47,360 the structure the general structure of 524 00:19:52,230 --> 00:19:49,919 the protein uh in solution not binding 525 00:19:54,549 --> 00:19:52,240 and we can see that this mutation the 526 00:19:56,310 --> 00:19:54,559 e-variant uh lost completely the 527 00:19:58,310 --> 00:19:56,320 secondary structure compared to the wild 528 00:20:01,430 --> 00:19:58,320 type that was like mostly helix and how 529 00:20:03,029 --> 00:20:01,440 we can see the e-library the e-variants 530 00:20:05,149 --> 00:20:03,039 show like a 531 00:20:07,510 --> 00:20:05,159 peak around 200 nanometer in the 532 00:20:10,789
--> 00:20:07,520 circular dichroism technique that show like 533 00:20:12,789 --> 00:20:10,799 that it's like highly disordered 534 00:20:15,590 --> 00:20:12,799 after that we try to quantify give some 535 00:20:18,149 --> 00:20:15,600 number about the binding so we perform 536 00:20:21,190 --> 00:20:18,159 the spr the surface plasmon resonance 537 00:20:23,830 --> 00:20:21,200 technique where we immobilize the 538 00:20:25,750 --> 00:20:23,840 the target rna on a chip and just pass 539 00:20:27,990 --> 00:20:25,760 on it like the protein the two proteins 540 00:20:31,830 --> 00:20:28,000 wild type and e-variant and we calculated 541 00:20:35,110 --> 00:20:31,840 the association and dissociation binding 542 00:20:38,549 --> 00:20:35,120 constant and we can see that the 543 00:20:41,029 --> 00:20:38,559 e-variant uh binds much slower to the 544 00:20:44,149 --> 00:20:41,039 target but on the other hand compared to 545 00:20:46,710 --> 00:20:44,159 the wild type but once the the 546 00:20:48,470 --> 00:20:46,720 the protein binds the rna 547 00:20:51,110 --> 00:20:48,480 it's more stable the complex it's more 548 00:20:54,470 --> 00:20:51,120 stable the overall kd that is the ratio 549 00:20:56,310 --> 00:20:54,480 between on and off it's mostly similar 550 00:20:58,149 --> 00:20:56,320 to the to wild type but the difference 551 00:20:59,990 --> 00:20:58,159 is mostly in the dissociation and 552 00:21:01,750 --> 00:21:00,000 actually this is a it suggests that 553 00:21:04,390 --> 00:21:01,760 maybe the evolution during the evolution 554 00:21:06,149 --> 00:21:04,400 like something so stable on the rna it's 555 00:21:08,070 --> 00:21:06,159 not so advantageous if we imagine like a 556 00:21:10,070 --> 00:21:08,080 ribosome or whatever or every mechanism 557 00:21:11,830 --> 00:21:10,080 in the cell it's something dynamic but 558 00:21:13,830 --> 00:21:11,840 here we have something that once it's 559 00:21:15,909 --> 00:21:13,840 bound it stays there so maybe the 560 00:21:18,230 -->
00:21:15,919 evolution also select this one to 561 00:21:21,110 --> 00:21:18,240 towards something more dynamic 562 00:21:23,669 --> 00:21:21,120 uh another a further characterization was 563 00:21:26,549 --> 00:21:23,679 done by uh pull down technique so we 564 00:21:28,390 --> 00:21:26,559 immobilized the complex on a bead support 565 00:21:31,190 --> 00:21:28,400 and changing the parameter like 566 00:21:33,590 --> 00:21:31,200 temperature ph or the presence of ions 567 00:21:35,430 --> 00:21:33,600 we can destabilize or not the complex if 568 00:21:37,510 --> 00:21:35,440 the complex is destabilized the protein 569 00:21:39,270 --> 00:21:37,520 get released and we have a signal on 570 00:21:41,190 --> 00:21:39,280 western blot 571 00:21:43,350 --> 00:21:41,200 we notice that compared to the wild type 572 00:21:44,390 --> 00:21:43,360 the e-variant is much more sensitive to 573 00:21:46,950 --> 00:21:44,400 temperature 574 00:21:49,270 --> 00:21:46,960 and to pretty extreme ph but what was 575 00:21:50,789 --> 00:21:49,280 really interesting it was uh 576 00:21:53,350 --> 00:21:50,799 really interesting was that in the 577 00:21:55,750 --> 00:21:53,360 complete absence of ions or metal ions 578 00:21:58,710 --> 00:21:55,760 so in the buffer was just buffer 579 00:22:01,029 --> 00:21:58,720 the complex was destabilized 580 00:22:03,270 --> 00:22:01,039 but this didn't happen in the case of the 581 00:22:05,830 --> 00:22:03,280 wild type so it means that these these 582 00:22:10,310 --> 00:22:05,840 ions were involved in somehow in 583 00:22:15,510 --> 00:22:13,110 to give further suggestions like proof to 584 00:22:17,270 --> 00:22:15,520 this theory uh we perform in 585 00:22:18,470 --> 00:22:17,280 collaboration with the academy of 586 00:22:20,789 --> 00:22:18,480 sciences of the czech republic in czech 587 00:22:21,990 --> 00:22:20,799 republic uh the molecular dynamics 588 00:22:24,149 --> 00:22:22,000 simulation 589 00:22:25,830 --> 00:22:24,159 uh we use as template the
the crystal 590 00:22:27,990 --> 00:22:25,840 structure of the complex of the wild 591 00:22:28,870 --> 00:22:28,000 type that it's available it's a pdb it's 592 00:23:17,990 --> 00:22:28,880 a 593 00:23:20,070 --> 00:23:18,000 diet given like further proof to this 594 00:23:21,350 --> 00:23:20,080 experimental data that metal ion 595 00:23:23,430 --> 00:23:21,360 actually 596 00:23:27,590 --> 00:23:23,440 help to the the interface to bind 597 00:23:31,990 --> 00:23:29,830 so in conclusion uh 598 00:23:34,070 --> 00:23:32,000 first of all we demonstrate that an 599 00:23:37,750 --> 00:23:34,080 early protein composed of only early 600 00:23:39,830 --> 00:23:37,760 amino acids is still able to bind the rna 601 00:23:41,510 --> 00:23:39,840 and second for the first time we give 602 00:23:44,310 --> 00:23:41,520 like for the first the first 603 00:23:46,470 --> 00:23:44,320 experimental indication that cations 604 00:23:48,870 --> 00:23:46,480 like magnesium can really help the 605 00:23:51,269 --> 00:23:48,880 interaction between rna and protein that 606 00:23:53,830 --> 00:23:51,279 can also be possible in the in modern 607 00:23:56,230 --> 00:23:53,840 world maybe just we didn't look at it 608 00:23:57,510 --> 00:23:56,240 but it's still possible it's another way 609 00:23:59,510 --> 00:23:57,520 of interaction 610 00:24:01,669 --> 00:23:59,520 and third one third 611 00:24:04,230 --> 00:24:01,679 third we can say that 612 00:24:07,350 --> 00:24:04,240 a world the prebiotic world without late 613 00:24:10,149 --> 00:24:07,360 amino acids was possible and probably the 614 00:24:12,230 --> 00:24:10,159 they were inserted inside the evolution 615 00:24:15,190 --> 00:24:12,240 because just to help to fine-tune the 616 00:24:16,870 --> 00:24:15,200 interaction between rna and protein just 617 00:24:19,830 --> 00:24:16,880 to make everything more dynamic but 618 00:24:21,909 --> 00:24:19,840 anyway their absence still like 619 00:24:24,149 --> 00:24:21,919 even without
them the the rna was still 620 00:24:26,230 --> 00:24:24,159 possible to interact with protein 621 00:24:28,230 --> 00:24:26,240 this work was published in molecular 622 00:24:30,390 --> 00:24:28,240 biology and evolution journal where we 623 00:24:32,310 --> 00:24:30,400 also got the cover and 624 00:24:34,070 --> 00:24:32,320 in this qr code you can find the the 625 00:24:36,149 --> 00:24:34,080 paper if you want to read there are much 626 00:24:37,990 --> 00:24:36,159 more detail like scientific detail about 627 00:24:40,230 --> 00:24:38,000 experiment about the binding the 628 00:24:42,710 --> 00:24:40,240 structure and whatever and i want to 629 00:24:45,269 --> 00:24:42,720 really thank like my colleague clara 630 00:24:47,110 --> 00:24:45,279 okova groups and 631 00:24:50,210 --> 00:24:47,120 and all the collaborators and you for 632 00:24:56,310 --> 00:24:50,220 your attention thank you 633 00:24:59,750 --> 00:24:58,390 brilliant thank you valeria 634 00:25:06,950 --> 00:24:59,760 do you have any questions from the 635 00:25:11,190 --> 00:25:08,950 hi uh jessica bowman from georgia tech 636 00:25:12,710 --> 00:25:11,200 that was a super interesting talk um i'm 637 00:25:15,430 --> 00:25:12,720 from the lab 638 00:25:19,909 --> 00:25:18,070 and we are frequently looking at 639 00:25:22,630 --> 00:25:19,919 protein and rna interactions 640 00:25:25,190 --> 00:25:22,640 specifically rna from the ribosome 641 00:25:26,789 --> 00:25:25,200 one of your conclusions indicated that 642 00:25:29,590 --> 00:25:26,799 this is the first known 643 00:25:32,870 --> 00:25:29,600 interaction between a protein and 644 00:25:35,590 --> 00:25:32,880 ribosomal rna that's magnesium mediated 645 00:25:38,149 --> 00:25:35,600 if i recall correctly chelang shaw of 646 00:25:41,750 --> 00:25:38,159 our group published 647 00:25:46,789 --> 00:25:44,230 uh ribosomal protein 648 00:25:48,470 --> 00:25:46,799 in the rna that is magnesium mediated by 649 00:25:49,830 --> 00:25:48,480
an am n 650 00:25:51,990 --> 00:25:49,840 conserved 651 00:25:53,830 --> 00:25:52,000 region in that ul2 protein 652 00:25:56,310 --> 00:25:53,840 just a comment 653 00:25:57,750 --> 00:25:56,320 yeah actually we are also working on it 654 00:25:59,110 --> 00:25:57,760 like it's a parallel pro it's not my 655 00:26:01,110 --> 00:25:59,120 project but our colleague we are 656 00:26:02,710 --> 00:26:01,120 studying about this and like also yeah 657 00:26:04,549 --> 00:26:02,720 we noticed that in especially in the 658 00:26:06,549 --> 00:26:04,559 ribosome the presence of magnesium it's 659 00:26:09,430 --> 00:26:06,559 important to stabilize this 660 00:26:11,669 --> 00:26:09,440 so we can also fit this this model in in 661 00:26:13,350 --> 00:26:11,679 the recent world like of the ribosome so 662 00:26:15,750 --> 00:26:13,360 yeah 663 00:26:17,830 --> 00:26:15,760 thank you just one other comment 664 00:26:19,669 --> 00:26:17,840 interestingly in that case we have a 665 00:26:20,950 --> 00:26:19,679 later paper also i think chao longsha 666 00:26:23,029 --> 00:26:20,960 was the first author 667 00:26:24,710 --> 00:26:23,039 um that demonstrated 668 00:26:27,510 --> 00:26:24,720 interactions between 669 00:26:30,230 --> 00:26:27,520 a proposed ancestral ribosomal rna and 670 00:26:32,390 --> 00:26:30,240 some of these um ancestral peptides or 671 00:26:36,310 --> 00:26:32,400 hypothesized ancestral peptides one of 672 00:26:37,269 --> 00:26:36,320 which was ul2 we looked at ul2 ul3 ul4 673 00:26:39,669 --> 00:26:37,279 and 674 00:26:41,430 --> 00:26:39,679 what was interesting is that most of 675 00:26:43,430 --> 00:26:41,440 those um 676 00:26:45,350 --> 00:26:43,440 interactions were not magnesium was 677 00:26:47,350 --> 00:26:45,360 shown to disrupt the interaction between 678 00:26:49,269 --> 00:26:47,360 the protein and the rna in the case of 679 00:26:50,549 --> 00:26:49,279 ul2 680 00:26:52,549 --> 00:26:50,559 so just 681 00:26:56,950 --> 00:26:52,559 we can
talk afterwards yeah sure sure 682 00:27:01,350 --> 00:26:59,510 chris may or bacon um university of 683 00:27:03,350 --> 00:27:01,360 maryland baltimore county very 684 00:27:06,230 --> 00:27:03,360 interesting talk 685 00:27:09,430 --> 00:27:06,240 in one of your slides you mentioned that 686 00:27:11,669 --> 00:27:09,440 the absence of magnesium 687 00:27:13,750 --> 00:27:11,679 or potassium 688 00:27:17,669 --> 00:27:13,760 disrupted the 689 00:27:20,710 --> 00:27:17,679 rna the rna binding and you showed md 690 00:27:22,789 --> 00:27:20,720 simulations about the role of magnesium 691 00:27:26,070 --> 00:27:22,799 i'm curious if 692 00:27:27,590 --> 00:27:26,080 i'm curious where the role of potassium 693 00:27:30,230 --> 00:27:27,600 ions come in 694 00:27:31,990 --> 00:27:30,240 in a stabilizing the rna protein 695 00:27:34,389 --> 00:27:32,000 interaction 696 00:27:36,310 --> 00:27:34,399 yeah okay i didn't perform like by 697 00:27:38,470 --> 00:27:36,320 myself was in collaboration but i know 698 00:27:41,029 --> 00:27:38,480 that like actually the first simulation 699 00:27:43,350 --> 00:27:41,039 was through uh potassium ion actually 700 00:27:45,350 --> 00:27:43,360 and they were like stuck there and after 701 00:27:47,669 --> 00:27:45,360 uh it substitute like the potassium with 702 00:27:50,070 --> 00:27:47,679 magnesium and it like confirmed this 703 00:27:51,909 --> 00:27:50,080 data so also the potassium ion were like 704 00:27:54,389 --> 00:27:51,919 present in the in the in the first 705 00:27:56,789 --> 00:27:54,399 simulation in the um in the structure in 706 00:27:58,310 --> 00:27:56,799 the interface 707 00:27:59,669 --> 00:27:58,320 all right interesting thank you all 708 00:28:01,350 --> 00:27:59,679 right i'm afraid we might we need to 709 00:28:03,669 --> 00:28:01,360 move on but we do have some time at the 710 00:28:06,470 --> 00:28:03,679 end for extra discussion so i apologize 711 00:28:09,510 --> 00:28:06,480 to the third questioner 
thank you um our 712 00:28:10,389 --> 00:28:09,520 third presentation is going to be from 713 00:28:13,190 --> 00:28:10,399 um 714 00:28:15,590 --> 00:28:13,200 uh dr pratik vyas who 715 00:28:19,430 --> 00:28:15,600 is joining us remotely from the weizmann 716 00:28:20,870 --> 00:28:19,440 institute of science in rehovot israel 717 00:28:34,149 --> 00:28:20,880 he is a 718 00:28:37,510 --> 00:28:36,549 hi hi stephen and thank you for 719 00:28:38,950 --> 00:28:37,520 um 720 00:28:41,590 --> 00:28:38,960 having me 721 00:28:42,950 --> 00:28:41,600 um so basically my research like the 722 00:28:45,430 --> 00:28:42,960 broad goal of my research is to 723 00:28:46,630 --> 00:28:45,440 understand how did the first enzymes 724 00:28:47,430 --> 00:28:46,640 evolve 725 00:28:50,230 --> 00:28:47,440 and 726 00:28:52,149 --> 00:28:50,240 if like enzyme evolution basically it 727 00:28:54,789 --> 00:28:52,159 relates to recruitment of pre-existing 728 00:28:57,430 --> 00:28:54,799 enzymes to perform new function by a 729 00:28:59,190 --> 00:28:57,440 series of mutations and selections 730 00:29:01,669 --> 00:28:59,200 this is synonymous to teaching an old 731 00:29:02,950 --> 00:29:01,679 dog new tricks like enzymes being the 732 00:29:04,950 --> 00:29:02,960 old dogs 733 00:29:06,950 --> 00:29:04,960 but the key question in the field is 734 00:29:09,269 --> 00:29:06,960 that how and where did the old dog come 735 00:29:10,630 --> 00:29:09,279 about in the first place 736 00:29:12,070 --> 00:29:10,640 because if you look at the modern day 737 00:29:14,549 --> 00:29:12,080 proteins we know that they are 738 00:29:17,029 --> 00:29:14,559 incredibly complex and yet it stands to 739 00:29:20,310 --> 00:29:17,039 reason that in the pre-luca world these 740 00:29:22,630 --> 00:29:20,320 complex proteins likely emerged from 741 00:29:24,310 --> 00:29:22,640 precursors that were much simpler 742 00:29:25,350 --> 00:29:24,320 both in terms of the sequence and the 743 00:29:26,950
--> 00:29:25,360 structure 744 00:29:29,350 --> 00:29:26,960 so what were these precursors of these 745 00:29:31,190 --> 00:29:29,360 complex proteins what kind of function 746 00:29:33,430 --> 00:29:31,200 did they possess what kind of structure 747 00:29:34,870 --> 00:29:33,440 did they possess and can we relate their 748 00:29:36,230 --> 00:29:34,880 structure and function to their modern 749 00:29:37,430 --> 00:29:36,240 day counterparts 750 00:29:40,230 --> 00:29:37,440 these are the questions that i'm trying 751 00:29:42,470 --> 00:29:40,240 to address in my in my work and 752 00:29:43,830 --> 00:29:42,480 specifically i'm trying to understand 753 00:29:45,590 --> 00:29:43,840 experimentally 754 00:29:48,950 --> 00:29:45,600 what were the precursors of this family 755 00:29:51,510 --> 00:29:48,960 of enzymes known as the p-loop ntpases 756 00:29:53,669 --> 00:29:51,520 so the p-loop ntpases are one of 757 00:29:55,269 --> 00:29:53,679 the most diverse and abundant protein 758 00:29:56,950 --> 00:29:55,279 families that we know of 759 00:29:59,350 --> 00:29:56,960 these include 760 00:30:01,350 --> 00:29:59,360 uh complex macromolecular machines such 761 00:30:03,669 --> 00:30:01,360 as the atp synthases 762 00:30:05,269 --> 00:30:03,679 reca recombinases helicases and 763 00:30:07,269 --> 00:30:05,279 many other proteins that are implicated 764 00:30:10,230 --> 00:30:07,279 in essential life processes 765 00:30:12,149 --> 00:30:10,240 and also p loop ntpases are one of the most 766 00:30:14,389 --> 00:30:12,159 ancient protein families that we know of 767 00:30:16,870 --> 00:30:14,399 and these are unambiguously assigned to the 768 00:30:19,350 --> 00:30:16,880 last universal common ancestor 769 00:30:21,590 --> 00:30:19,360 so both these attributes make the p loop 770 00:30:23,669 --> 00:30:21,600 ntpases attractive candidates to study 771 00:30:25,830 --> 00:30:23,679 protein evolution 772 00:30:28,149 --> 00:30:25,840 so in in all the p-loop
ntpases the 773 00:30:29,909 --> 00:30:28,159 critical element is the walker a motif 774 00:30:32,230 --> 00:30:29,919 or the p-loop motif which is 775 00:30:34,950 --> 00:30:32,240 essentially a glycine rich loop 776 00:30:37,430 --> 00:30:34,960 that is embedded in a beta loop alpha 777 00:30:39,590 --> 00:30:37,440 element and the glycine rich loop 778 00:30:41,510 --> 00:30:39,600 mainly via the g k and the t that 779 00:30:44,870 --> 00:30:41,520 resides at the tip of the helix and also 780 00:30:47,350 --> 00:30:44,880 by the glycines it interacts with ntp 781 00:30:49,510 --> 00:30:47,360 the phosphates of ntp such as atp and 782 00:30:51,750 --> 00:30:49,520 gtp and mediates the transfer of the 783 00:30:55,269 --> 00:30:51,760 terminal phosphoryl group in reactions 784 00:30:58,310 --> 00:30:55,279 such as atp hydrolysis or atp synthesis 785 00:31:00,710 --> 00:30:58,320 so this extended beta p-loop alpha motif 786 00:31:03,350 --> 00:31:00,720 underlies all the enzymes that belong 787 00:31:05,990 --> 00:31:03,360 to the p loop ntpase family 788 00:31:09,190 --> 00:31:06,000 and structurally the the core domain of 789 00:31:11,509 --> 00:31:09,200 p-loop ntpases comprises of a three-layer 790 00:31:14,310 --> 00:31:11,519 alpha beta alpha sandwich-like 791 00:31:17,029 --> 00:31:14,320 architecture and almost always the 792 00:31:18,470 --> 00:31:17,039 p-loop is a part of the first beta loop 793 00:31:20,070 --> 00:31:18,480 and alpha element 794 00:31:22,310 --> 00:31:20,080 so with this background and with this 795 00:31:23,509 --> 00:31:22,320 structural information it brings me back 796 00:31:26,070 --> 00:31:23,519 to the question that i'm trying to 797 00:31:28,389 --> 00:31:26,080 address is that what were the precursors 798 00:31:29,990 --> 00:31:28,399 of p-loop ntpases 799 00:31:31,990 --> 00:31:30,000 so to answer this question we need to 800 00:31:33,669 --> 00:31:32,000 first understand or we need to first ask 801 00:31:35,269 --> 00:31:33,679 is
that what do these precursors 802 00:31:36,710 --> 00:31:35,279 actually do what kind of functions would 803 00:31:38,789 --> 00:31:36,720 they possess 804 00:31:41,669 --> 00:31:38,799 so previously it has been shown by liam 805 00:31:43,190 --> 00:31:41,679 longo who is also one of the speakers in 806 00:31:44,149 --> 00:31:43,200 today's session 807 00:31:46,549 --> 00:31:44,159 is that 808 00:31:48,310 --> 00:31:46,559 binding to phosphate containing ligands 809 00:31:50,789 --> 00:31:48,320 was one of the founding function or one 810 00:31:53,509 --> 00:31:50,799 of the ancient functions of not only the 811 00:31:55,509 --> 00:31:53,519 p loop ntpases but also of many 812 00:31:57,029 --> 00:31:55,519 other evolutionary ancient 813 00:31:59,029 --> 00:31:57,039 families such as the rossmanns and the 814 00:32:01,430 --> 00:31:59,039 flavodoxins 815 00:32:03,590 --> 00:32:01,440 and in all these ancient families 816 00:32:06,310 --> 00:32:03,600 phosphate binding is realized by a 817 00:32:08,630 --> 00:32:06,320 stretch of simple abiotic amino acids 818 00:32:10,389 --> 00:32:08,640 such as glycine serine and threonine 819 00:32:12,710 --> 00:32:10,399 that reside at the n-terminal tip of 820 00:32:15,509 --> 00:32:12,720 the helix and this interaction this 821 00:32:17,350 --> 00:32:15,519 phosphate binding interaction is via a 822 00:32:19,190 --> 00:32:17,360 bidentate backbone interaction as well 823 00:32:21,029 --> 00:32:19,200 as a side chain interaction 824 00:32:22,870 --> 00:32:21,039 so we established that we concluded that 825 00:32:25,430 --> 00:32:22,880 phosphate binding functions was one of 826 00:32:28,310 --> 00:32:25,440 the ancient founding functions of of p 827 00:32:30,870 --> 00:32:28,320 loop ntpases and this is what we set to 828 00:32:33,909 --> 00:32:30,880 assess if the ancient precursors of p 829 00:32:36,470 --> 00:32:33,919 loop ntpases can bind phosphates 830 00:32:37,509 --> 00:32:36,480 so before that our our our
hypothesis 831 00:32:39,590 --> 00:32:37,519 was that 832 00:32:41,750 --> 00:32:39,600 the the beta p loop alpha motif that i 833 00:32:45,029 --> 00:32:41,760 just mentioned was one of the or was 834 00:32:46,950 --> 00:32:45,039 rather the the earliest standalone seed 835 00:32:49,509 --> 00:32:46,960 segment which then underwent 836 00:32:52,470 --> 00:32:49,519 self-assembly duplication and fusion to 837 00:32:54,710 --> 00:32:52,480 give rise to modern day p-loop ntpases 838 00:32:56,470 --> 00:32:54,720 and to test this strategy 839 00:32:58,630 --> 00:32:56,480 we use the strategy uh to test this 840 00:33:00,870 --> 00:32:58,640 hypothesis we use a strategy where we 841 00:33:03,110 --> 00:33:00,880 construct uh prototypes which are 842 00:33:04,950 --> 00:33:03,120 essentially mimics of ancient p loop 843 00:33:06,789 --> 00:33:04,960 ntpases 844 00:33:07,830 --> 00:33:06,799 so essentially what we do over here is 845 00:33:09,909 --> 00:33:07,840 we take 846 00:33:11,830 --> 00:33:09,919 uh the ancestrally reconstructed copies 847 00:33:15,029 --> 00:33:11,840 of the beta p-loop alpha from all the 848 00:33:17,990 --> 00:33:15,039 p-loop ntpases and grafted it onto a very 849 00:33:20,310 --> 00:33:18,000 rudimentary scaffold that mimics the 850 00:33:21,909 --> 00:33:20,320 core domain of the p-loop ntpases that is 851 00:33:23,509 --> 00:33:21,919 the three-layered alpha beta alpha 852 00:33:25,269 --> 00:33:23,519 sandwich architecture 853 00:33:27,590 --> 00:33:25,279 but it does not have any of the other 854 00:33:29,350 --> 00:33:27,600 active site residues that modern day p 855 00:33:31,750 --> 00:33:29,360 loop ntpases have 856 00:33:34,950 --> 00:33:31,760 and then we see if these prototypes the 857 00:33:35,909 --> 00:33:34,960 simple prototypes can function 858 00:33:38,070 --> 00:33:35,919 so we 859 00:33:40,950 --> 00:33:38,080 interestingly we did see that these p-loop 860 00:33:43,269 --> 00:33:40,960 prototypes bound to atp as shown 861
00:33:45,509 --> 00:33:43,279 here in an spr method that was just 862 00:33:48,070 --> 00:33:45,519 discussed by the previous speaker 863 00:33:50,389 --> 00:33:48,080 but what was more interesting was that 864 00:33:52,470 --> 00:33:50,399 these fragments of these proto proteins 865 00:33:54,789 --> 00:33:52,480 also bound single-stranded dna as you 866 00:33:56,470 --> 00:33:54,799 can see here by higher signal 867 00:33:58,710 --> 00:33:56,480 relative to the double-stranded dna in 868 00:33:59,990 --> 00:33:58,720 an elisa-based method 869 00:34:01,909 --> 00:34:00,000 so it was 870 00:34:03,909 --> 00:34:01,919 it was great that these prototypes 871 00:34:05,669 --> 00:34:03,919 bound both ntps and single-stranded dna 872 00:34:07,509 --> 00:34:05,679 and i must add that they bind to both 873 00:34:09,190 --> 00:34:07,519 these ligands via the same phosphate 874 00:34:11,669 --> 00:34:09,200 binding loop 875 00:34:14,629 --> 00:34:11,679 we now wanted to see if we can extend 876 00:34:16,950 --> 00:34:14,639 from the realm of just ligand binding 877 00:34:18,389 --> 00:34:16,960 and ask if these prototypes or these 878 00:34:20,069 --> 00:34:18,399 protoproteins 879 00:34:23,349 --> 00:34:20,079 do have any function which is of 880 00:34:24,950 --> 00:34:23,359 greater evolutionary relevance so we 881 00:34:26,470 --> 00:34:24,960 asked if these p-loop prototypes can 882 00:34:29,270 --> 00:34:26,480 remodel nucleic acid or more 883 00:34:31,510 --> 00:34:29,280 specifically if they can unwind dna 884 00:34:33,430 --> 00:34:31,520 given that they bind preferably to 885 00:34:35,430 --> 00:34:33,440 single-stranded dna can they shift the 886 00:34:38,069 --> 00:34:35,440 equilibrium from a double stranded bound 887 00:34:40,310 --> 00:34:38,079 form to a single stranded bound form 888 00:34:42,230 --> 00:34:40,320 and we were basically guided by the 889 00:34:45,109 --> 00:34:42,240 observation that many of the luca p loop 890 00:34:47,109 --> 00:34:45,119
ntpases were helicases recombinases 891 00:34:48,950 --> 00:34:47,119 and translocases 892 00:34:51,109 --> 00:34:48,960 and it goes without saying that in the 893 00:34:53,510 --> 00:34:51,119 pre-luca world composed of nucleic acids 894 00:34:54,629 --> 00:34:53,520 and and proteins the ability to remodel 895 00:34:56,550 --> 00:34:54,639 nucleic acid would have been an 896 00:34:58,390 --> 00:34:56,560 important function 897 00:34:59,990 --> 00:34:58,400 our second guiding observation was that 898 00:35:01,190 --> 00:35:00,000 although in most of the contemporary 899 00:35:02,630 --> 00:35:01,200 helicases 900 00:35:04,470 --> 00:35:02,640 the phosphate binding loop does not 901 00:35:07,030 --> 00:35:04,480 interact with the single stranded dna 902 00:35:09,430 --> 00:35:07,040 and yet in the pdb we were able to find 903 00:35:10,630 --> 00:35:09,440 certain instances or certain vestiges 904 00:35:12,550 --> 00:35:10,640 where we see 905 00:35:14,470 --> 00:35:12,560 that the phosphate binding loop does 906 00:35:15,829 --> 00:35:14,480 interact with the 907 00:35:17,109 --> 00:35:15,839 the phosphate backbone of the single 908 00:35:20,069 --> 00:35:17,119 stranded dna 909 00:35:21,670 --> 00:35:20,079 especially in xpd helicases so given 910 00:35:23,990 --> 00:35:21,680 like with both these observations we 911 00:35:26,310 --> 00:35:24,000 then wanted to test if the 912 00:35:28,630 --> 00:35:26,320 prototypes can unwind dna 913 00:35:30,550 --> 00:35:28,640 and to test our hypothesis we used an 914 00:35:32,470 --> 00:35:30,560 assay known as the molecular beacon 915 00:35:34,150 --> 00:35:32,480 assay where you have a double-stranded 916 00:35:35,670 --> 00:35:34,160 piece of dna 917 00:35:37,349 --> 00:35:35,680 the top strand of which has a 918 00:35:39,430 --> 00:35:37,359 fluorophore and a quencher at opposite 919 00:35:41,750 --> 00:35:39,440 ends and if the dna strands were to be 920 00:35:43,750 --> 00:35:41,760 unwound it can form a beacon-like 921
00:35:45,109 --> 00:35:43,760 structure due to self-complementary ends 922 00:35:46,950 --> 00:35:45,119 and resulting in the loss of 923 00:35:49,750 --> 00:35:46,960 fluorescence 924 00:35:51,829 --> 00:35:49,760 and indeed the intact prototype as soon 925 00:35:53,349 --> 00:35:51,839 as you add it to a fluorescent dna as 926 00:35:56,390 --> 00:35:53,359 you can see here you see a drop in 927 00:35:57,990 --> 00:35:56,400 fluorescence that reaches the baseline 928 00:36:00,230 --> 00:35:58,000 in a two two-hour time scale and the 929 00:36:02,069 --> 00:36:00,240 baseline over here basically represents 930 00:36:03,430 --> 00:36:02,079 a completely quenched state 931 00:36:05,829 --> 00:36:03,440 so it was great that the intact 932 00:36:07,510 --> 00:36:05,839 prototype mediates dna unwinding or 933 00:36:09,190 --> 00:36:07,520 strand separation 934 00:36:11,589 --> 00:36:09,200 but we wanted to see 935 00:36:13,270 --> 00:36:11,599 how small can we go while still 936 00:36:14,870 --> 00:36:13,280 retaining the function 937 00:36:17,109 --> 00:36:14,880 so here 938 00:36:19,270 --> 00:36:17,119 by a series of truncation and 939 00:36:20,950 --> 00:36:19,280 circular permutation we narrowed down or 940 00:36:23,109 --> 00:36:20,960 shortened down the intact prototype 941 00:36:25,349 --> 00:36:23,119 from 110 amino acid to something which 942 00:36:27,030 --> 00:36:25,359 is less than 40 amino acid and this 943 00:36:28,950 --> 00:36:27,040 construct which we call as the n alpha 944 00:36:32,870 --> 00:36:28,960 beta alpha construct just has an alpha 945 00:36:35,589 --> 00:36:32,880 helix and the beta p-loop alpha motif 946 00:36:38,150 --> 00:36:35,599 so this n alpha beta alpha construct 947 00:36:39,990 --> 00:36:38,160 not only does it unwind dna it is the 948 00:36:42,230 --> 00:36:40,000 most efficient at dna unwinding as you 949 00:36:44,150 --> 00:36:42,240 can see by sharp drop in fluorescence 950 00:36:45,829 --> 00:36:44,160 indicating strand
separation by the 951 00:36:47,109 --> 00:36:45,839 molecular beacon assay and it reaches 952 00:36:50,069 --> 00:36:47,119 the baseline 953 00:36:52,790 --> 00:36:50,079 so overall it suggests that the the 954 00:36:55,109 --> 00:36:52,800 basic beta p-loop alpha motif demonstrates 955 00:36:57,109 --> 00:36:55,119 significant structural plasticity in that 956 00:36:58,950 --> 00:36:57,119 you can put it in a variety of reduced 957 00:37:01,349 --> 00:36:58,960 complexity structural complexity 958 00:37:02,790 --> 00:37:01,359 scaffolds and it still not only retains 959 00:37:04,870 --> 00:37:02,800 the function but it can also show 960 00:37:06,950 --> 00:37:04,880 enhanced activity and this structural 961 00:37:09,910 --> 00:37:06,960 plasticity would have been crucial for 962 00:37:11,750 --> 00:37:09,920 primordial peptides to function 963 00:37:13,589 --> 00:37:11,760 so overall the helicase-like activity 964 00:37:15,109 --> 00:37:13,599 that i just showed you provides a 965 00:37:18,230 --> 00:37:15,119 plausible solution to the rna 966 00:37:20,550 --> 00:37:18,240 replication problem which is once the 967 00:37:22,069 --> 00:37:20,560 rna molecules have been replicated and 968 00:37:24,630 --> 00:37:22,079 once they have formed a double stranded 969 00:37:26,470 --> 00:37:24,640 structure for them to unwind or for them 970 00:37:28,069 --> 00:37:26,480 to open up it requires an unwinding 971 00:37:29,990 --> 00:37:28,079 polypeptide for the second round of 972 00:37:32,470 --> 00:37:30,000 replication to occur and this is where 973 00:37:34,550 --> 00:37:32,480 the p-loop prototypes or proto-peptides 974 00:37:37,750 --> 00:37:34,560 like the ones which i've shown you would 975 00:37:40,870 --> 00:37:37,760 have provided a solution to this problem 976 00:37:43,430 --> 00:37:40,880 okay so i mentioned earlier uh that 977 00:37:45,349 --> 00:37:43,440 these fragments bind to ntps and single 978 00:37:47,910 --> 00:37:45,359 stranded dna both by the phosphate 979
00:37:49,510 --> 00:37:47,920 binding loop if that is the case can we 980 00:37:51,510 --> 00:37:49,520 have some kind of an exchange between 981 00:37:54,150 --> 00:37:51,520 the two ligands 982 00:37:55,829 --> 00:37:54,160 and it turns out we can so what you 983 00:37:57,430 --> 00:37:55,839 see over here is the same molecular beacon 984 00:37:59,430 --> 00:37:57,440 assay where you see a decrease in 985 00:38:01,190 --> 00:37:59,440 fluorescence upon addition of protein 986 00:38:02,550 --> 00:38:01,200 and at this point 987 00:38:04,710 --> 00:38:02,560 when the dna molecules have been 988 00:38:07,670 --> 00:38:04,720 completely unwound if we add ligands 989 00:38:08,950 --> 00:38:07,680 like gtp and atp you see that the bound 990 00:38:11,030 --> 00:38:08,960 proteins release 991 00:38:13,349 --> 00:38:11,040 allowing the dna to revert back to its 992 00:38:15,109 --> 00:38:13,359 initial unbound state as you can see by 993 00:38:17,270 --> 00:38:15,119 increasing fluorescence 994 00:38:19,349 --> 00:38:17,280 therefore resembling some kind of a 995 00:38:21,109 --> 00:38:19,359 rudimentary helicase cycle 996 00:38:22,710 --> 00:38:21,119 whereas modern day helicases what they 997 00:38:25,109 --> 00:38:22,720 do is what they would use the energy of 998 00:38:26,870 --> 00:38:25,119 atp hydrolysis unwind the dna and 999 00:38:28,870 --> 00:38:26,880 release from the dna so we see that 1000 00:38:32,150 --> 00:38:28,880 these prototypes also have some helicase 1001 00:38:33,750 --> 00:38:32,160 like activity or helicase like cycles 1002 00:38:36,150 --> 00:38:33,760 but what was the most interesting part 1003 00:38:38,470 --> 00:38:36,160 which i'm going to talk now is that 1004 00:38:39,990 --> 00:38:38,480 inorganic polyphosphates that is 1005 00:38:41,750 --> 00:38:40,000 long-chain polyphosphates and 1006 00:38:43,430 --> 00:38:41,760 hexametaphosphate which is cyclic form 1007 00:38:45,990 --> 00:38:43,440 of phosphate was the most efficient in 1008
00:38:48,230 --> 00:38:46,000 releasing the proteins from the dna as 1009 00:38:50,870 --> 00:38:48,240 you can see here 5.6 micromolar of 1010 00:38:53,589 --> 00:38:50,880 hexametaphosphate can release 1011 00:38:54,550 --> 00:38:53,599 almost 50 percent of the proteins bound to the 1012 00:38:57,510 --> 00:38:54,560 dna 1013 00:38:59,109 --> 00:38:57,520 whereas atp requires 1014 00:39:00,550 --> 00:38:59,119 almost three millimolar concentration to 1015 00:39:03,910 --> 00:39:00,560 have the same effect 1016 00:39:05,829 --> 00:39:03,920 so given that these primordial proteins bind 1017 00:39:07,750 --> 00:39:05,839 favorably to inorganic polyphosphate 1018 00:39:09,589 --> 00:39:07,760 which has also been proposed to be 1019 00:39:11,109 --> 00:39:09,599 the ancient precursor 1020 00:39:13,510 --> 00:39:11,119 of ntps 1021 00:39:15,829 --> 00:39:13,520 we can say that the mode of action of 1022 00:39:18,390 --> 00:39:15,839 these prototypes is quite tailored to 1023 00:39:20,870 --> 00:39:18,400 the needs of the primordial world 1024 00:39:23,670 --> 00:39:20,880 so basically now you can ask me 1025 00:39:25,990 --> 00:39:23,680 how can such a short fragment 1026 00:39:27,990 --> 00:39:26,000 demonstrate such complex function and 1027 00:39:30,310 --> 00:39:28,000 we know that the key to 1028 00:39:32,470 --> 00:39:30,320 function is the ability of these 1029 00:39:33,670 --> 00:39:32,480 short proteins to oligomerize or to 1030 00:39:36,390 --> 00:39:33,680 self-assemble 1031 00:39:39,510 --> 00:39:36,400 by native mass spec we have shown that 1032 00:39:42,630 --> 00:39:39,520 the n-alpha beta alpha peptide can form 1033 00:39:44,950 --> 00:39:42,640 large oligomers up to 30-mer complexes 1034 00:39:46,550 --> 00:39:44,960 and this is the key for it to function 1035 00:39:48,630 --> 00:39:46,560 otherwise a short peptide cannot 1036 00:39:50,069 --> 00:39:48,640 function by itself in a solvent exposed 1037 00:39:51,589
--> 00:39:50,079 group 1038 00:39:52,950 --> 00:39:51,599 so to summarize 1039 00:39:54,630 --> 00:39:52,960 uh the ancient p-loop was a 1040 00:39:56,230 --> 00:39:54,640 multifunctional p-loop one 1041 00:39:59,510 --> 00:39:56,240 which had to do multiple functions such 1042 00:40:00,630 --> 00:39:59,520 as dna binding single uh ntp binding dna 1043 00:40:02,630 --> 00:40:00,640 unwinding 1044 00:40:05,030 --> 00:40:02,640 and such multi-functional prototypes 1045 00:40:07,109 --> 00:40:05,040 then underwent self-assembly duplication 1046 00:40:09,109 --> 00:40:07,119 and fusion to give rise to modern day 1047 00:40:12,150 --> 00:40:09,119 proteins which have specialized domains 1048 00:40:13,829 --> 00:40:12,160 that carry out specialized functions 1049 00:40:16,309 --> 00:40:13,839 and to end i would 1050 00:40:18,069 --> 00:40:16,319 say that these fragments these p-loop 1051 00:40:19,990 --> 00:40:18,079 prototypes 1052 00:40:21,910 --> 00:40:20,000 satisfy the basic postulates regarding 1053 00:40:24,150 --> 00:40:21,920 the emergence of earliest proteins in 1054 00:40:25,349 --> 00:40:24,160 that they are relatively short they 1055 00:40:27,430 --> 00:40:25,359 are composed of 1056 00:40:30,309 --> 00:40:27,440 almost a minimal abiotic amino acid 1057 00:40:31,910 --> 00:40:30,319 alphabet these prototypes have a lysine 1058 00:40:33,270 --> 00:40:31,920 and a his-tag but we know that if you 1059 00:40:35,589 --> 00:40:33,280 remove the his-tag and even if you 1060 00:40:37,270 --> 00:40:35,599 mutate the lysine to a glycine they 1061 00:40:39,829 --> 00:40:37,280 still retain function and they are 1062 00:40:42,069 --> 00:40:39,839 incredibly tolerant to mutations 1063 00:40:44,550 --> 00:40:42,079 and the last postulate is 1064 00:40:47,430 --> 00:40:44,560 that they tend to self-assemble which 1065 00:40:49,109 --> 00:40:47,440 allows them to form a larger structural 1066 00:40:51,990 --> 00:40:49,119 you know configuration 1067
00:40:54,390 --> 00:40:52,000 that is crucial for function 1068 00:40:56,390 --> 00:40:54,400 so to conclude i would say that the p 1069 00:40:58,950 --> 00:40:56,400 loop prototypes despite their simplicity 1070 00:41:00,550 --> 00:40:58,960 relate to contemporary p-loop ntpases 1071 00:41:03,510 --> 00:41:00,560 in terms of their sequence structure and 1072 00:41:05,109 --> 00:41:03,520 function and that they serve as starting 1073 00:41:07,910 --> 00:41:05,119 points or evolutionary starting points 1074 00:41:09,990 --> 00:41:07,920 for enzymes with more complex activity 1075 00:41:12,309 --> 00:41:10,000 and it is only apt that i end the 1076 00:41:14,390 --> 00:41:12,319 presentation with this quote from darwin 1077 00:41:16,309 --> 00:41:14,400 which was also one of dhani's favorite 1078 00:41:17,270 --> 00:41:16,319 quotes from so simple a 1079 00:41:18,950 --> 00:41:17,280 beginning 1080 00:41:20,550 --> 00:41:18,960 endless forms most beautiful and most 1081 00:41:21,829 --> 00:41:20,560 wonderful have been and are being 1082 00:41:23,430 --> 00:41:21,839 evolved 1083 00:41:25,910 --> 00:41:23,440 and 1084 00:41:26,790 --> 00:41:25,920 i would like to thank the people from my 1085 00:41:28,630 --> 00:41:26,800 lab 1086 00:41:30,550 --> 00:41:28,640 i would like to thank sarah fleischmann 1087 00:41:32,470 --> 00:41:30,560 who is my new supervisor 1088 00:41:33,990 --> 00:41:32,480 stephen and the organizing 1089 00:41:35,109 --> 00:41:34,000 committee for giving me this opportunity 1090 00:41:36,630 --> 00:41:35,119 once again 1091 00:41:38,790 --> 00:41:36,640 and the volkswagen foundation and the 1092 00:41:40,309 --> 00:41:38,800 weizmann institute for the generous 1093 00:41:48,069 --> 00:41:40,319 funding 1094 00:41:51,589 --> 00:41:50,150 you pratik for a stimulating talk and in 1095 00:41:53,430 --> 00:41:51,599 the interest of time we're going to 1096 00:41:55,030 --> 00:41:53,440 suppress questions but we do have extra 1097
00:41:57,430 --> 00:41:55,040 time at the end for questions i'm sure 1098 00:42:00,069 --> 00:41:57,440 there will be many i'm going to 1099 00:42:02,710 --> 00:42:00,079 go ahead and introduce our next speaker 1100 00:42:05,670 --> 00:42:02,720 who is claudia alvarez who is a 1101 00:42:06,790 --> 00:42:05,680 postdoctoral scholar in the laboratory 1102 00:42:12,069 --> 00:42:06,800 of 1103 00:42:16,550 --> 00:42:13,829 thank you 1104 00:42:17,829 --> 00:42:16,560 i'm going to talk about protein fold 1105 00:42:18,630 --> 00:42:17,839 evolution 1106 00:42:19,589 --> 00:42:18,640 or 1107 00:42:24,309 --> 00:42:19,599 how 1108 00:42:29,349 --> 00:42:26,870 so in this work we wanted to understand 1109 00:42:31,670 --> 00:42:29,359 the evolutionary mechanisms that led to 1110 00:42:33,750 --> 00:42:31,680 the diversity of protein folds in 1111 00:42:36,950 --> 00:42:33,760 contemporary biology 1112 00:42:40,470 --> 00:42:36,960 so for example in a human cell or in a 1113 00:42:41,750 --> 00:42:40,480 human proteome there are around 20 000 1114 00:42:47,510 --> 00:42:41,760 proteins 1115 00:42:50,309 --> 00:42:47,520 000 unique units 1116 00:42:53,030 --> 00:42:50,319 so 1000 is a very small number when 1117 00:42:57,109 --> 00:42:53,040 compared to the total number of proteins 1118 00:42:58,790 --> 00:42:57,119 that are present in a single human cell 1119 00:43:00,390 --> 00:42:58,800 and i think it's also a very small 1120 00:43:03,510 --> 00:43:00,400 number when you think 1121 00:43:07,670 --> 00:43:03,520 that these are the product of 3.8 1122 00:43:12,870 --> 00:43:10,309 but we can see the same question from a 1123 00:43:14,069 --> 00:43:12,880 different perspective 1124 00:43:16,069 --> 00:43:14,079 so the 1125 00:43:18,430 --> 00:43:16,079 emergence of 1126 00:43:20,630 --> 00:43:18,440 folding-competent sequences is a 1127 00:43:23,589 --> 00:43:20,640 multi-layer problem 1128 00:43:26,710 --> 00:43:23,599 so first we have the problem of the 1129 00:43:28,950
--> 00:43:26,720 amino acid sequences being very 1130 00:43:31,910 --> 00:43:28,960 there are many combinations 1131 00:43:34,630 --> 00:43:31,920 so for the extant genetic code there are 1132 00:43:36,950 --> 00:43:34,640 far more possible amino acid sequences 1133 00:43:38,309 --> 00:43:36,960 than there are stars in the universe 1134 00:43:41,270 --> 00:43:38,319 actually for 1135 00:43:43,990 --> 00:43:41,280 a sequence of 100 residues there are 1136 00:43:46,950 --> 00:43:44,000 more combinations that are possible than 1137 00:43:48,309 --> 00:43:46,960 atoms in the universe so 1138 00:43:51,190 --> 00:43:48,319 um 1139 00:43:54,069 --> 00:43:51,200 it's unlikely that all of these 1140 00:43:55,910 --> 00:43:54,079 sequences can be sampled 1141 00:43:58,069 --> 00:43:55,920 the next problem is that not all 1142 00:44:00,950 --> 00:43:58,079 combinations will result in a stable 1143 00:44:04,309 --> 00:44:00,960 fold and then when you finally find a 1144 00:44:05,990 --> 00:44:04,319 combination that can fold stably 1145 00:44:07,670 --> 00:44:06,000 it's not 1146 00:44:08,470 --> 00:44:07,680 like you can 1147 00:44:11,670 --> 00:44:08,480 move 1148 00:44:13,270 --> 00:44:11,680 from fold to fold just by simply 1149 00:44:14,870 --> 00:44:13,280 modifying 1150 00:44:16,630 --> 00:44:14,880 the sequence 1151 00:44:20,309 --> 00:44:16,640 step by step 1152 00:44:23,349 --> 00:44:20,319 so there are not many examples of 1153 00:44:25,430 --> 00:44:23,359 sequences that can transition from one 1154 00:44:28,950 --> 00:44:25,440 fold to the other 1155 00:44:32,390 --> 00:44:28,960 but what we do find is many examples of 1156 00:44:35,030 --> 00:44:32,400 sequences that share similarity between 1157 00:44:37,910 --> 00:44:35,040 different folds the similarity in this 1158 00:44:40,950 --> 00:44:37,920 case is not overall in the entire 1159 00:44:42,950 --> 00:44:40,960 sequence but just in a small fragment 1160 00:44:45,990 --> 00:44:42,960 these are called
cross-fold sequence 1161 00:44:49,109 --> 00:44:46,000 similarities and they suggest fold 1162 00:44:51,190 --> 00:44:49,119 evolution so once you find these 1163 00:44:53,510 --> 00:44:51,200 cross-fold sequence similarities you can 1164 00:44:55,589 --> 00:44:53,520 assume there's an evolutionary history 1165 00:44:58,950 --> 00:44:55,599 that is shared but you still don't 1166 00:45:02,069 --> 00:44:58,960 understand how these came to be and we 1167 00:45:04,790 --> 00:45:02,079 wanted to know the step-by-step process 1168 00:45:07,750 --> 00:45:04,800 of how this happened so we started 1169 00:45:10,790 --> 00:45:07,760 looking at examples we thought do we 1170 00:45:14,390 --> 00:45:10,800 really know of a case of fold evolution 1171 00:45:16,670 --> 00:45:14,400 that we completely understand 1172 00:45:18,390 --> 00:45:16,680 and it turns out that there is a 1173 00:45:20,550 --> 00:45:18,400 paradigmatic case 1174 00:45:23,030 --> 00:45:20,560 that is circular permutation 1175 00:45:26,150 --> 00:45:23,040 so circular permutation is a 1176 00:45:27,829 --> 00:45:26,160 relationship between two proteins 1177 00:45:29,430 --> 00:45:27,839 or two topologies 1178 00:45:31,670 --> 00:45:29,440 that have a very similar 1179 00:45:33,430 --> 00:45:31,680 three-dimensional structure but the 1180 00:45:35,510 --> 00:45:33,440 secondary structural elements are 1181 00:45:39,829 --> 00:45:35,520 rearranged 1182 00:45:42,870 --> 00:45:39,839 so how do you get from fold a to fold b 1183 00:45:45,910 --> 00:45:42,880 simply by circularizing 1184 00:45:48,230 --> 00:45:45,920 fold a and then if you cleave at 1185 00:45:49,829 --> 00:45:48,240 whichever point in the 1186 00:45:52,710 --> 00:45:49,839 protein structure 1187 00:45:55,510 --> 00:45:52,720 you will get the circular permutant of 1188 00:45:57,510 --> 00:45:55,520 fold a 1189 00:45:59,829 --> 00:45:57,520 but this is not what happens in 1190 00:46:02,470 --> 00:45:59,839 evolution so there are many
examples of 1191 00:46:04,230 --> 00:46:02,480 circular permutation but the mechanism 1192 00:46:06,470 --> 00:46:04,240 is not this 1193 00:46:10,470 --> 00:46:06,480 what happens in evolution is that you 1194 00:46:13,030 --> 00:46:10,480 get one gene that is duplicated in line 1195 00:46:16,390 --> 00:46:13,040 so usually a duplication of a domain 1196 00:46:18,390 --> 00:46:16,400 gives a repeat of the same fold so you 1197 00:46:21,270 --> 00:46:18,400 have the same fold twice in a single 1198 00:46:23,510 --> 00:46:21,280 protein but when you have circular 1199 00:46:26,550 --> 00:46:23,520 permutation this is not what happens the 1200 00:46:27,750 --> 00:46:26,560 duplication opens a new folding 1201 00:46:30,230 --> 00:46:27,760 landscape 1202 00:46:33,109 --> 00:46:30,240 for this protein and then a new fold 1203 00:46:35,030 --> 00:46:33,119 emerges and this new fold will have some 1204 00:46:37,109 --> 00:46:35,040 secondary structural elements from the 1205 00:46:39,109 --> 00:46:37,119 first copy of the repeat and some 1206 00:46:40,710 --> 00:46:39,119 secondary elements from the second copy 1207 00:46:43,109 --> 00:46:40,720 of the repeat 1208 00:46:46,069 --> 00:46:43,119 so the last step in the maturation of 1209 00:46:48,150 --> 00:46:46,079 the circular permutant is the loss of 1210 00:46:50,309 --> 00:46:48,160 the terminal segment 1211 00:46:52,309 --> 00:46:50,319 and that way you have a daughter fold 1212 00:46:53,910 --> 00:46:52,319 that is very similar to the ancestral 1213 00:46:56,790 --> 00:46:53,920 fold but the secondary structural 1214 00:46:59,030 --> 00:46:56,800 elements are rearranged 1215 00:47:01,190 --> 00:46:59,040 so what do we learn from the study of 1216 00:47:03,750 --> 00:47:01,200 circular permutation well we learned 1217 00:47:04,790 --> 00:47:03,760 that if we take many homologs to the 1218 00:47:07,750 --> 00:47:04,800 first 1219 00:47:09,349 --> 00:47:07,760 ancestral copy 1220 00:47:11,750 --> 00:47:09,359 
sometimes we will have 1221 00:47:13,990 --> 00:47:11,760 sequences that are more similar to the 1222 00:47:16,230 --> 00:47:14,000 n-terminus of the daughter fold and 1223 00:47:18,710 --> 00:47:16,240 sometimes we will have 1224 00:47:22,230 --> 00:47:18,720 other sequences that are more similar to 1225 00:47:25,829 --> 00:47:22,240 the second half of the daughter fold 1226 00:47:28,150 --> 00:47:25,839 so if we sample a long enough uh list of 1227 00:47:30,790 --> 00:47:28,160 sequences and then we align them we will 1228 00:47:33,030 --> 00:47:30,800 get a pattern of cross-fold sequence 1229 00:47:35,430 --> 00:47:33,040 similarities that will look 1230 00:47:37,589 --> 00:47:35,440 like this 1231 00:47:40,069 --> 00:47:37,599 and this is what we want to look for so 1232 00:47:42,870 --> 00:47:40,079 now we have a strategy we know what we 1233 00:47:44,549 --> 00:47:42,880 want to look for and we can interpret 1234 00:47:45,589 --> 00:47:44,559 this pattern 1235 00:47:48,390 --> 00:47:45,599 next 1236 00:47:51,430 --> 00:47:48,400 where do we start and we started with one 1237 00:47:53,670 --> 00:47:51,440 of the ribosomal proteins of course 1238 00:47:55,990 --> 00:47:53,680 this is universal ribosomal protein ul2 1239 00:47:57,589 --> 00:47:56,000 and this is very interesting 1240 00:47:59,510 --> 00:47:57,599 this is a very interesting protein 1241 00:48:01,990 --> 00:47:59,520 because it's one of the few universal 1242 00:48:04,549 --> 00:48:02,000 ribosomal proteins that has more than 1243 00:48:07,109 --> 00:48:04,559 one domain it's a multi-domain protein 1244 00:48:10,470 --> 00:48:07,119 the two domains in ul2 are distinct 1245 00:48:12,549 --> 00:48:10,480 these are called sh3 and ob 1246 00:48:14,870 --> 00:48:12,559 and these two folds are present 1247 00:48:17,349 --> 00:48:14,880 everywhere in the translation machinery 1248 00:48:20,069 --> 00:48:17,359 so from other ribosomal proteins to 1249 00:48:23,430 --> 00:48:20,079 aminoacyl-trna
synthetases and 1250 00:48:25,829 --> 00:48:23,440 initiation and elongation factors 1251 00:48:28,309 --> 00:48:25,839 so we took ul2 1252 00:48:30,950 --> 00:48:28,319 built multiple sequence alignments 1253 00:48:32,069 --> 00:48:30,960 searched the evolutionary classification 1254 00:48:33,430 --> 00:48:32,079 of domains 1255 00:48:35,109 --> 00:48:33,440 for 1256 00:48:38,710 --> 00:48:35,119 sequence similarities and we were 1257 00:48:41,030 --> 00:48:38,720 looking for this characteristic pattern 1258 00:48:44,069 --> 00:48:41,040 so these are the results of our search 1259 00:48:46,470 --> 00:48:44,079 for cross-fold sequence similarities in 1260 00:48:49,270 --> 00:48:46,480 the orange squares i'm showing you the 1261 00:48:51,430 --> 00:48:49,280 region where we would expect to see um 1262 00:48:53,990 --> 00:48:51,440 these cross-fold sequence similarities 1263 00:48:55,349 --> 00:48:54,000 and you can see that i have divided the 1264 00:48:57,990 --> 00:48:55,359 results into 1265 00:48:59,270 --> 00:48:58,000 different panels 1266 00:49:00,790 --> 00:48:59,280 so the first 1267 00:49:03,670 --> 00:49:00,800 one shows the cross-fold sequence 1268 00:49:06,390 --> 00:49:03,680 similarities between ob and sh3 and the 1269 00:49:07,990 --> 00:49:06,400 second one between ob and cradle-loop 1270 00:49:09,910 --> 00:49:08,000 barrels 1271 00:49:11,430 --> 00:49:09,920 so these folds are 1272 00:49:13,670 --> 00:49:11,440 in the field of 1273 00:49:16,470 --> 00:49:13,680 protein fold evolution like rock stars 1274 00:49:18,710 --> 00:49:16,480 of protein fold evolution these 1275 00:49:19,430 --> 00:49:18,720 have been very well studied 1276 00:49:22,950 --> 00:49:19,440 and 1277 00:49:27,589 --> 00:49:25,270 have been very interesting 1278 00:49:30,630 --> 00:49:27,599 for this field 1279 00:49:33,750 --> 00:49:30,640 so for the first one sh3 and ob 1280 00:49:35,190 --> 00:49:33,760 i am showing you here in color 1281
00:49:35,200 i have mapped 1282 00:49:41,190 --> 00:49:37,280 the region across fault sequence 1283 00:49:44,309 --> 00:49:41,200 similarity into 3d and into 1d 1284 00:49:47,589 --> 00:49:44,319 representations so on the left we have 1285 00:49:50,230 --> 00:49:47,599 one pair of sh3 and ob that share one 1286 00:49:52,150 --> 00:49:50,240 region in this region we can see that 1287 00:49:54,870 --> 00:49:52,160 the cross fall sequence similarity also 1288 00:49:57,430 --> 00:49:54,880 corresponds to a very similar structure 1289 00:49:59,109 --> 00:49:57,440 and then for the pair on the right 1290 00:50:01,589 --> 00:49:59,119 the region of crossfall sequence 1291 00:50:04,549 --> 00:50:01,599 similarity is also similar in structure 1292 00:50:08,549 --> 00:50:04,559 but there is a variation 1293 00:50:10,790 --> 00:50:08,559 there's a different turn between them 1294 00:50:12,630 --> 00:50:10,800 and now the next thing that we can do 1295 00:50:15,589 --> 00:50:12,640 because these two ob folds are 1296 00:50:18,150 --> 00:50:15,599 homologous we can align one to the other 1297 00:50:20,470 --> 00:50:18,160 and then bring their respective sh3 1298 00:50:23,349 --> 00:50:20,480 pairs to the alignment and when we do 1299 00:50:25,349 --> 00:50:23,359 that we find the characteristic pattern 1300 00:50:27,430 --> 00:50:25,359 that is similar to to the circular 1301 00:50:30,309 --> 00:50:27,440 permutation case 1302 00:50:32,710 --> 00:50:30,319 now if we study the case of obn cradle 1303 00:50:35,510 --> 00:50:32,720 loop barrel we have the same we have one 1304 00:50:37,430 --> 00:50:35,520 pair on the left that has one region of 1305 00:50:40,069 --> 00:50:37,440 cross-full sequence similarity and then 1306 00:50:42,630 --> 00:50:40,079 one pair on the right that shows a 1307 00:50:45,510 --> 00:50:42,640 different region so we did exactly the 1308 00:50:47,510 --> 00:50:45,520 same we aligned these two cradle loop 1309 00:50:49,829 --> 00:50:47,520 parallels 
together 1310 00:50:52,549 --> 00:50:49,839 and then brought in the obs 1311 00:50:53,510 --> 00:50:52,559 and this is the pattern that we observe 1312 00:50:57,109 --> 00:50:53,520 so 1313 00:50:58,950 --> 00:50:57,119 what do we think happened 1314 00:51:01,510 --> 00:50:58,960 so what we think 1315 00:51:06,390 --> 00:51:01,520 can be said about these relationships is 1316 00:51:09,190 --> 00:51:06,400 that possibly one ob fold ancestor 1317 00:51:12,549 --> 00:51:09,200 duplicated so usually you would get a 1318 00:51:14,790 --> 00:51:12,559 repeat of the ob fold but in this case 1319 00:51:17,589 --> 00:51:14,800 the repeat didn't give rise to a 1320 00:51:20,829 --> 00:51:17,599 repeat of the structure we got a new 1321 00:51:24,309 --> 00:51:20,839 fold with new hydrogen bonds and new 1322 00:51:27,349 --> 00:51:24,319 interactions so this fold matured and 1323 00:51:29,750 --> 00:51:27,359 transformed into what we know now as a 1324 00:51:32,390 --> 00:51:29,760 cradle-loop barrel so in this cradle-loop 1325 00:51:34,710 --> 00:51:32,400 barrel we have some secondary structural 1326 00:51:36,870 --> 00:51:34,720 elements and motifs that are very 1327 00:51:41,030 --> 00:51:36,880 similar to the ancestor 1328 00:51:43,430 --> 00:51:41,040 and some others that are new 1329 00:51:44,870 --> 00:51:43,440 what we can say is that relationships 1330 00:51:48,069 --> 00:51:44,880 between ob 1331 00:51:50,630 --> 00:51:48,079 sh3 and cradle-loop barrels illustrate a 1332 00:51:53,190 --> 00:51:50,640 process that generates new fold 1333 00:51:55,589 --> 00:51:53,200 topologies from within 1334 00:51:58,390 --> 00:51:55,599 and we would say incessantly destroying 1335 00:51:59,990 --> 00:51:58,400 the old one incessantly creating a new 1336 00:52:03,910 --> 00:52:00,000 one so 1337 00:52:05,910 --> 00:52:03,920 here for example two sh3s form one ob 1338 00:52:08,790 --> 00:52:05,920 two obs form one cradle loop 1339 00:52:13,589 --> 00:52:11,349 so we called
this process creative 1340 00:52:16,710 --> 00:52:13,599 destruction and this is the idea that 1341 00:52:19,750 --> 00:52:16,720 once you have one fold you can create 1342 00:52:22,870 --> 00:52:19,760 many from that one so maybe you don't 1343 00:52:25,750 --> 00:52:22,880 need to create many folds many times 1344 00:52:27,910 --> 00:52:25,760 you just need to create one and then you 1345 00:52:30,470 --> 00:52:27,920 can generate many 1346 00:52:33,349 --> 00:52:30,480 so creative destruction acts on the 1347 00:52:36,230 --> 00:52:33,359 level of domains depends on fold 1348 00:52:38,790 --> 00:52:36,240 plasticity and resolves cross-fold 1349 00:52:41,430 --> 00:52:38,800 similarities by a biologically plausible 1350 00:52:43,990 --> 00:52:41,440 mechanism suggesting that the universe 1351 00:52:46,549 --> 00:52:44,000 of protein folds is better described as 1352 00:52:48,790 --> 00:52:46,559 a network than as a tree 1353 00:52:49,829 --> 00:52:48,800 so i want to thank everyone on this 1354 00:52:53,190 --> 00:52:49,839 slide 1355 00:52:54,540 --> 00:52:53,200 and we have a preprint for more details 1356 00:53:09,190 --> 00:52:54,550 thank you 1357 00:53:14,790 --> 00:53:11,670 hey anthony brunetti here also from 1358 00:53:16,710 --> 00:53:14,800 georgia tech and i was uh wondering so 1359 00:53:19,190 --> 00:53:16,720 this so this 1360 00:53:22,390 --> 00:53:19,200 work that you showed is looking at these 1361 00:53:25,190 --> 00:53:22,400 incredibly ancient incredibly deeply 1362 00:53:27,270 --> 00:53:25,200 important uh really common uh folds and 1363 00:53:29,510 --> 00:53:27,280 things i was wondering 1364 00:53:31,349 --> 00:53:29,520 could another way of looking at this be 1365 00:53:32,470 --> 00:53:31,359 trying to find 1366 00:53:33,510 --> 00:53:32,480 uh 1367 00:53:36,470 --> 00:53:33,520 newer 1368 00:53:39,109 --> 00:53:36,480 proteins because there are new proteins 1369 00:53:40,870 --> 00:53:39,119 being generated especially in like giant
1370 00:53:42,470 --> 00:53:40,880 virus genomes and things like that and i 1371 00:53:44,230 --> 00:53:42,480 wonder if 1372 00:53:46,870 --> 00:53:44,240 even though those aren't extraordinarily 1373 00:53:48,470 --> 00:53:46,880 well established or extraordinarily 1374 00:53:50,470 --> 00:53:48,480 well understood i wonder if that might 1375 00:53:53,349 --> 00:53:50,480 be a place to see 1376 00:53:56,950 --> 00:53:53,359 rapid rates of this happening if 1377 00:53:58,870 --> 00:53:56,960 this is uh going on there yeah so this 1378 00:54:02,630 --> 00:53:58,880 would be a process that can auto- 1379 00:54:05,430 --> 00:54:02,640 propagate and this example here 1380 00:54:06,710 --> 00:54:05,440 is actually of a protein that is present 1381 00:54:07,990 --> 00:54:06,720 in 1382 00:54:11,349 --> 00:54:08,000 humans 1383 00:54:14,069 --> 00:54:11,359 so this pdb code comes from a 1384 00:54:16,470 --> 00:54:14,079 sequence that very recently suffered 1385 00:54:22,590 --> 00:54:16,480 this creative destruction 1386 00:54:22,600 --> 00:54:37,030 [Applause] 1387 00:54:41,589 --> 00:54:39,670 wonderful well thank you for joining us 1388 00:54:44,309 --> 00:54:41,599 for this session 1389 00:54:46,870 --> 00:54:44,319 um i'm going to 1390 00:54:48,309 --> 00:54:46,880 begin by acknowledging the individuals 1391 00:54:50,150 --> 00:54:48,319 and organizations that have made this 1392 00:54:51,349 --> 00:54:50,160 research possible 1393 00:54:53,990 --> 00:54:51,359 i'm going to be talking about the work 1394 00:54:55,670 --> 00:54:54,000 of three of my uh students two are 1395 00:54:57,190 --> 00:54:55,680 graduate students philip toe and haley 1396 00:54:59,910 --> 00:54:57,200 moran and one is a very talented 1397 00:55:02,870 --> 00:54:59,920 undergraduate atarva bhagwat and we've 1398 00:55:04,309 --> 00:55:02,880 received support from hfsp the nih and 1399 00:55:06,630 --> 00:55:04,319 the nsf 1400 00:55:08,710 --> 00:55:06,640 so let
me start with a question 1401 00:55:11,349 --> 00:55:08,720 how do we know which proteins are the 1402 00:55:13,190 --> 00:55:11,359 most ancient well we can do our best to 1403 00:55:15,349 --> 00:55:13,200 try to answer this difficult and tangled 1404 00:55:17,349 --> 00:55:15,359 up question one way is we can sort of 1405 00:55:19,030 --> 00:55:17,359 infer that proteins are probably ancient 1406 00:55:21,349 --> 00:55:19,040 if they're extremely important and if we 1407 00:55:23,670 --> 00:55:21,359 can infer their presence in some 1408 00:55:25,190 --> 00:55:23,680 primordial organism such as luca but 1409 00:55:27,190 --> 00:55:25,200 the problem with this is that of course a 1410 00:55:29,109 --> 00:55:27,200 lot of protein evolution occurred before 1411 00:55:30,870 --> 00:55:29,119 luca especially of the most fundamental 1412 00:55:32,230 --> 00:55:30,880 domains such as the ones that claudia 1413 00:55:34,069 --> 00:55:32,240 was speaking about 1414 00:55:35,589 --> 00:55:34,079 we can also try to address this question 1415 00:55:38,309 --> 00:55:35,599 by looking at the phylogeny or the 1416 00:55:40,470 --> 00:55:38,319 distribution by creating the trees but 1417 00:55:42,309 --> 00:55:40,480 the problem is that there can be 1418 00:55:44,549 --> 00:55:42,319 expansion events whereby rapid 1419 00:55:47,430 --> 00:55:44,559 diversification of a certain fold class 1420 00:55:48,789 --> 00:55:47,440 can decouple the distribution and the 1421 00:55:51,030 --> 00:55:48,799 actual 1422 00:55:51,990 --> 00:55:51,040 order of incorporation into the protein 1423 00:55:53,750 --> 00:55:52,000 universe 1424 00:55:56,710 --> 00:55:53,760 so i'll sort of repeat the question can 1425 00:55:59,589 --> 00:55:56,720 any experimentally observable property 1426 00:56:02,230 --> 00:55:59,599 of a protein speak to the antiquity of 1427 00:56:04,230 --> 00:56:02,240 its provenance and we think the answer 1428 00:56:06,630 --> 00:56:04,240 to that question is yes and
it's it's 1429 00:56:08,470 --> 00:56:06,640 refoldability 1430 00:56:10,390 --> 00:56:08,480 so um my background is as a 1431 00:56:12,069 --> 00:56:10,400 protein-folding biophysicist and so we 1432 00:56:14,309 --> 00:56:12,079 think a lot about this remarkable 1433 00:56:16,549 --> 00:56:14,319 property of proteins whereby they can 1434 00:56:18,710 --> 00:56:16,559 spontaneously self-assemble into complex 1435 00:56:20,230 --> 00:56:18,720 three-dimensional architectures and this 1436 00:56:22,390 --> 00:56:20,240 is a property that is frequently 1437 00:56:23,990 --> 00:56:22,400 explored by biophysicists through 1438 00:56:25,510 --> 00:56:24,000 experiments in which proteins are 1439 00:56:27,349 --> 00:56:25,520 unfolded either by increasing 1440 00:56:30,150 --> 00:56:27,359 temperature or adding chaotropes like 1441 00:56:32,309 --> 00:56:30,160 urea or guanidine and then removing that 1442 00:56:34,870 --> 00:56:32,319 condition to return to the physiological 1443 00:56:36,950 --> 00:56:34,880 conditions under which some particular 1444 00:56:39,430 --> 00:56:36,960 proteins have this capacity to return to 1445 00:56:42,390 --> 00:56:39,440 their native fold unassisted so we call 1446 00:56:44,390 --> 00:56:42,400 this property of a protein refoldability 1447 00:56:46,309 --> 00:56:44,400 now of course the physical basis by 1448 00:56:48,470 --> 00:56:46,319 which this is generally explained is by 1449 00:56:50,470 --> 00:56:48,480 positing this so-called free energy 1450 00:56:53,349 --> 00:56:50,480 landscape in which we hypothesize that 1451 00:56:55,190 --> 00:56:53,359 native states reflect um thermodynamic 1452 00:56:56,870 --> 00:56:55,200 minima so that is the conformation that 1453 00:56:59,510 --> 00:56:56,880 lowers the gibbs free energy of the 1454 00:57:01,670 --> 00:56:59,520 system and if you um can posit that then 1455 00:57:03,190 --> 00:57:01,680 it's easy to imagine why you could do 1456 00:57:05,190 --> 00:57:03,200 whatever you want 
to this protein and 1457 00:57:07,510 --> 00:57:05,200 it's going to find safe passage back 1458 00:57:09,030 --> 00:57:07,520 home to its native fold because that's 1459 00:57:10,549 --> 00:57:09,040 basically what thermodynamics says it 1460 00:57:11,990 --> 00:57:10,559 has to do 1461 00:57:13,190 --> 00:57:12,000 but it's worth pointing out that even 1462 00:57:15,190 --> 00:57:13,200 though this is a property that we 1463 00:57:18,150 --> 00:57:15,200 frequently study it's by no means 1464 00:57:19,190 --> 00:57:18,160 universal it's basically a property of 1465 00:57:20,870 --> 00:57:19,200 small 1466 00:57:23,030 --> 00:57:20,880 single domain proteins the type that 1467 00:57:24,309 --> 00:57:23,040 biophysicists like to study but there 1468 00:57:26,069 --> 00:57:24,319 are lots of proteins that are 1469 00:57:27,750 --> 00:57:26,079 extraordinarily important for biology 1470 00:57:29,349 --> 00:57:27,760 that are complicated that involve lots 1471 00:57:31,510 --> 00:57:29,359 of moving parts that are embedded in 1472 00:57:33,349 --> 00:57:31,520 mechanisms and they use all sorts of 1473 00:57:35,670 --> 00:57:33,359 other machineries like chaperones in 1474 00:57:37,990 --> 00:57:35,680 order to be able to assemble so our 1475 00:57:40,710 --> 00:57:38,000 hypothesis is that by looking at what 1476 00:57:42,950 --> 00:57:40,720 classes of proteins are capable of 1477 00:57:45,829 --> 00:57:42,960 refolding themselves autonomously we're 1478 00:57:48,069 --> 00:57:45,839 in essence asking a biophysical basis of 1479 00:57:50,069 --> 00:57:48,079 antiquity because we don't think that 1480 00:57:52,230 --> 00:57:50,079 during the origin of life a complex 1481 00:57:54,549 --> 00:57:52,240 chaperone network or quality control was 1482 00:57:57,109 --> 00:57:54,559 available in essence the only quality 1483 00:57:59,670 --> 00:57:57,119 control that was available for or 4.2 1484 00:58:01,349 --> 00:57:59,680 billion years ago was thermodynamics and 1485 
00:58:04,390 --> 00:58:01,359 so as a consequence the intrinsic 1486 00:58:05,910 --> 00:58:04,400 refoldability of a protein is a bit of a 1487 00:58:08,309 --> 00:58:05,920 way of thinking about which ones were 1488 00:58:10,789 --> 00:58:08,319 probably easier to access before we had 1489 00:58:12,150 --> 00:58:10,799 more complex metabolism 1490 00:58:13,750 --> 00:58:12,160 on this note i'll point out that of 1491 00:58:16,069 --> 00:58:13,760 course one property of proteins that 1492 00:58:17,750 --> 00:58:16,079 makes them very different from rna 1493 00:58:19,270 --> 00:58:17,760 is that protein folding has a puzzle- 1494 00:58:21,750 --> 00:58:19,280 like quality in which there's only 1495 00:58:23,990 --> 00:58:21,760 really one or a small number of possible 1496 00:58:26,230 --> 00:58:24,000 solutions to minimize the energy which 1497 00:58:28,069 --> 00:58:26,240 is very different from rna where there 1498 00:58:29,589 --> 00:58:28,079 are many many different possible 1499 00:58:31,349 --> 00:58:29,599 near-degenerate 1500 00:58:33,510 --> 00:58:31,359 conformations that normally also have 1501 00:58:35,510 --> 00:58:33,520 reasonably low free energy and this is a 1502 00:58:37,510 --> 00:58:35,520 consideration that makes rna generally 1503 00:58:38,870 --> 00:58:37,520 less refoldable than protein and 1504 00:58:40,870 --> 00:58:38,880 perhaps something that we should think 1505 00:58:42,309 --> 00:58:40,880 more about in the context of origins of 1506 00:58:44,390 --> 00:58:42,319 life 1507 00:58:46,069 --> 00:58:44,400 but with that point aside i want to 1508 00:58:47,750 --> 00:58:46,079 briefly illustrate an experiment that 1509 00:58:49,510 --> 00:58:47,760 our team has been developing to try to 1510 00:58:50,470 --> 00:58:49,520 explore refoldability on the proteome 1511 00:58:51,990 --> 00:58:50,480 scale 1512 00:58:54,069 --> 00:58:52,000 so the way this experiment works is we 1513 00:58:56,150 --> 00:58:54,079 start with cells we lyse them
using 1514 00:58:57,829 --> 00:58:56,160 cryogenic pulverization which retains 1515 00:58:59,190 --> 00:58:57,839 the vast majority of proteins in their 1516 00:59:01,109 --> 00:58:59,200 native structure 1517 00:59:02,630 --> 00:59:01,119 we divide that sample in half so to one 1518 00:59:04,470 --> 00:59:02,640 half we'll do nothing we'll call that the 1519 00:59:06,390 --> 00:59:04,480 native sample to the other half we 1520 00:59:08,390 --> 00:59:06,400 globally unfold the entire proteome 1521 00:59:09,990 --> 00:59:08,400 using six molar guanidine and then 1522 00:59:11,910 --> 00:59:10,000 refold it by removing that guanidine 1523 00:59:14,390 --> 00:59:11,920 with a hundred-fold dilution 1524 00:59:16,789 --> 00:59:14,400 now the key part of this experiment is 1525 00:59:19,030 --> 00:59:16,799 that we then expose these two samples to 1526 00:59:21,910 --> 00:59:19,040 pulse proteolysis with this enzyme 1527 00:59:23,589 --> 00:59:21,920 called proteinase k now proteinase k is 1528 00:59:25,270 --> 00:59:23,599 a protease that has virtually no 1529 00:59:27,510 --> 00:59:25,280 sequence specificity so it can cut 1530 00:59:29,190 --> 00:59:27,520 between any two amino acids but it does 1531 00:59:30,789 --> 00:59:29,200 have a very strong preference to cut 1532 00:59:32,390 --> 00:59:30,799 regions that are more susceptible or 1533 00:59:34,789 --> 00:59:32,400 solvent exposed 1534 00:59:37,589 --> 00:59:34,799 so as a consequence proteinase k allows us 1535 00:59:39,430 --> 00:59:37,599 to encode structural information about 1536 00:59:42,069 --> 00:59:39,440 what the conformational ensemble of the 1537 00:59:43,910 --> 00:59:42,079 protein looks like into cleavage events 1538 00:59:45,750 --> 00:59:43,920 and of course since we are ultimately a 1539 00:59:47,910 --> 00:59:45,760 mass spectrometry proteomics lab what we 1540 00:59:50,309 --> 00:59:47,920 are very good at doing is sequencing and 1541 00:59:52,950 --> 00:59:50,319 quantifying tens if not 20 
000 different 1542 00:59:55,670 --> 00:59:52,960 peptides in one sample so by identifying 1543 00:59:57,750 --> 00:59:55,680 the um different peptidic fragments that 1544 00:59:59,430 --> 00:59:57,760 come from these digests we can address 1545 01:00:01,670 --> 00:59:59,440 the question of whether or not a protein 1546 01:00:03,829 --> 01:00:01,680 was conformationally identical in the 1547 01:00:05,430 --> 01:00:03,839 refolded sample in which case you'd 1548 01:00:07,589 --> 01:00:05,440 expect to get the same pattern of 1549 01:00:09,349 --> 01:00:07,599 fragments or for whatever reason 1550 01:00:12,230 --> 01:00:09,359 non-refoldable in which case we would 1551 01:00:14,789 --> 01:00:12,240 expect novel cleavage sites to appear 1552 01:00:18,150 --> 01:00:14,799 that were not available in the protein 1553 01:00:19,829 --> 01:00:18,160 when it was in its native folded form 1554 01:00:21,510 --> 01:00:19,839 so what do we get when we do this 1555 01:00:24,309 --> 01:00:21,520 experiment on e coli we find that 1556 01:00:25,829 --> 01:00:24,319 roughly 60 percent of e coli proteins are 1557 01:00:27,190 --> 01:00:25,839 refoldable 1558 01:00:28,789 --> 01:00:27,200 whether or not you consider that a lot 1559 01:00:30,789 --> 01:00:28,799 or a little sort of a glass half empty 1560 01:00:31,990 --> 01:00:30,799 glass half full the data set that 1561 01:00:33,430 --> 01:00:32,000 actually i'm going to be talking more 1562 01:00:35,750 --> 01:00:33,440 about in this presentation is when we 1563 01:00:37,510 --> 01:00:35,760 did the same experiment in yeast which 1564 01:00:39,349 --> 01:00:37,520 surprisingly actually has a higher 1565 01:00:40,789 --> 01:00:39,359 refoldability index and that's something 1566 01:00:42,710 --> 01:00:40,799 that we think there's a lot of really 1567 01:00:44,549 --> 01:00:42,720 interesting molecular biology associated 1568 01:00:45,990 --> 01:00:44,559 with but for the purpose of this talk 1569 01:00:47,910 --> 01:00:46,000 i'm just going to 
talk about our yeast 1570 01:00:49,589 --> 01:00:47,920 data set because the trends in it happen 1571 01:00:51,829 --> 01:00:49,599 to be cleaner because there's no there's 1572 01:00:53,990 --> 01:00:51,839 very little if not any aggregation in 1573 01:00:55,829 --> 01:00:54,000 these experiments 1574 01:00:57,349 --> 01:00:55,839 so what can we say about what types of 1575 01:00:59,190 --> 01:00:57,359 proteins are good at folding on their 1576 01:01:01,270 --> 01:00:59,200 own well one thing that we can do that's 1577 01:01:03,430 --> 01:01:01,280 very very simple is just divide up these 1578 01:01:05,430 --> 01:01:03,440 proteins into the number of domains that 1579 01:01:07,430 --> 01:01:05,440 they have and one thing that we find 1580 01:01:09,750 --> 01:01:07,440 very cleanly is that the more domains 1581 01:01:11,910 --> 01:01:09,760 that a protein has the harder it is at 1582 01:01:13,750 --> 01:01:11,920 folding and this makes a lot of sense 1583 01:01:16,710 --> 01:01:13,760 because it's long been hypothesized that 1584 01:01:18,630 --> 01:01:16,720 multi-domain proteins rely more on 1585 01:01:20,870 --> 01:01:18,640 folding on the ribosome or so-called 1586 01:01:22,069 --> 01:01:20,880 co-translational folding and the reason 1587 01:01:24,470 --> 01:01:22,079 why that is is because when you're 1588 01:01:26,710 --> 01:01:24,480 folding on the ribosome the first domain 1589 01:01:28,950 --> 01:01:26,720 can fold before the second domain has 1590 01:01:30,710 --> 01:01:28,960 even been formed and the second domain 1591 01:01:32,870 --> 01:01:30,720 can fold after the first domain has 1592 01:01:35,109 --> 01:01:32,880 already folded so it acts as a 1593 01:01:37,349 --> 01:01:35,119 convenient way of decoupling the folding 1594 01:01:39,589 --> 01:01:37,359 of complex objects which of course is 1595 01:01:41,589 --> 01:01:39,599 not available when you're doing 1596 01:01:43,670 --> 01:01:41,599 refolding of a completely denatured 1597 01:01:45,190 --> 
01:01:43,680 chain 1598 01:01:47,589 --> 01:01:45,200 now the other thing that we can do is we 1599 01:01:49,670 --> 01:01:47,599 can look at these individual domains and 1600 01:01:51,910 --> 01:01:49,680 classify them into an evolutionary 1601 01:01:53,670 --> 01:01:51,920 lineage and to do that we make use of 1602 01:01:55,910 --> 01:01:53,680 the ecod system that you've heard about 1603 01:01:57,670 --> 01:01:55,920 from claudia as well as from liam this 1604 01:02:02,150 --> 01:01:57,680 is a way of classifying the protein 1605 01:02:07,109 --> 01:02:05,190 fold groups that have a common ancestor 1606 01:02:09,270 --> 01:02:07,119 and one thing that we find is that the 1607 01:02:10,789 --> 01:02:09,280 types of protein or the types of domains 1608 01:02:12,950 --> 01:02:10,799 rather i should say that are 1609 01:02:14,950 --> 01:02:12,960 extraordinarily refoldable have a lot of 1610 01:02:17,829 --> 01:02:14,960 traits in common they are generally 1611 01:02:19,029 --> 01:02:17,839 small they are generally all alpha or 1612 01:02:21,990 --> 01:02:19,039 all beta 1613 01:02:24,390 --> 01:02:22,000 and they are highly represented amongst 1614 01:02:26,710 --> 01:02:24,400 folds that bind to nucleotides and small 1615 01:02:29,109 --> 01:02:26,720 peptides and in that group it's both the 1616 01:02:31,430 --> 01:02:29,119 sh3 fold and the ob fold that claudia 1617 01:02:33,430 --> 01:02:31,440 was telling us a lot about what we find 1618 01:02:35,510 --> 01:02:33,440 is that the worst refolding in every 1619 01:02:37,430 --> 01:02:35,520 organism that we've looked at so far is 1620 01:02:39,910 --> 01:02:37,440 always found amongst folds that are 1621 01:02:42,549 --> 01:02:39,920 associated with the aminoacyl trna 1622 01:02:44,710 --> 01:02:42,559 synthetases as well as tim barrels with 1623 01:02:46,390 --> 01:02:44,720 rossmanns and p-loops not being so far 1624 01:02:47,990 --> 01:02:46,400 behind 1625 01:02:49,589 --> 01:02:48,000 so just to sort of put a 
picture onto 1626 01:02:50,950 --> 01:02:49,599 some of these domains if you're not uh 1627 01:02:53,270 --> 01:02:50,960 used to looking at lots of different 1628 01:02:55,670 --> 01:02:53,280 protein structures again also to reinforce 1629 01:02:57,829 --> 01:02:55,680 the sh3 and ob are these small all-beta 1630 01:02:59,910 --> 01:02:57,839 folds the helix turn helix is a small 1631 01:03:01,990 --> 01:02:59,920 alpha fold and of course tim barrels and 1632 01:03:03,670 --> 01:03:02,000 rossmanns are alpha slash beta folds that 1633 01:03:07,270 --> 01:03:03,680 tend to be larger and more topologically 1634 01:03:08,870 --> 01:03:07,280 complex and have a greater contact order 1635 01:03:10,390 --> 01:03:08,880 another thing that we can do is organize 1636 01:03:12,549 --> 01:03:10,400 these proteins on the basis of their 1637 01:03:14,710 --> 01:03:12,559 acidity what we find is that the worst 1638 01:03:15,910 --> 01:03:14,720 refolders are mildly acidic so that 1639 01:03:18,710 --> 01:03:15,920 means that these are things that have an 1640 01:03:21,190 --> 01:03:18,720 isoelectric point between five and seven 1641 01:03:22,950 --> 01:03:21,200 very acidic proteins tend to be pretty 1642 01:03:26,069 --> 01:03:22,960 good refolders and that i think bodes 1643 01:03:27,589 --> 01:03:26,079 well for hypotheses about um the ancient 1644 01:03:29,349 --> 01:03:27,599 proteins that were of course highly 1645 01:03:31,589 --> 01:03:29,359 acidic and would have had pis less than 1646 01:03:33,990 --> 01:03:31,599 five but we also find that very basic 1647 01:03:36,069 --> 01:03:34,000 proteins also tend to refold very well 1648 01:03:37,589 --> 01:03:36,079 and here our hypothesis is possibly that 1649 01:03:40,069 --> 01:03:37,599 these are proteins whose folding is 1650 01:03:42,789 --> 01:03:40,079 chaperoned by rna 1651 01:03:44,789 --> 01:03:42,799 now on that topic we can also look um 1652 01:03:46,390 --> 01:03:44,799 closely at the ribosomal proteins and 1653 
01:03:48,870 --> 01:03:46,400 when we did that we found a truly 1654 01:03:51,349 --> 01:03:48,880 shocking discovery and that is that in 1655 01:03:53,990 --> 01:03:51,359 both e coli and in yeast the large 1656 01:03:55,910 --> 01:03:54,000 subunit is almost entirely refoldable in 1657 01:03:57,430 --> 01:03:55,920 yeast it's completely refoldable and 1658 01:03:59,589 --> 01:03:57,440 i'll remind you that this was not in 1659 01:04:01,349 --> 01:03:59,599 some pre-ordained biochemical reaction 1660 01:04:04,309 --> 01:04:01,359 this was literally refolding entire 1661 01:04:06,549 --> 01:04:04,319 extracts so lots of components very 1662 01:04:09,109 --> 01:04:06,559 messy the small subunit on the other 1663 01:04:10,549 --> 01:04:09,119 hand tends to be much less refoldable 1664 01:04:12,549 --> 01:04:10,559 and we think that this is an 1665 01:04:14,549 --> 01:04:12,559 interesting finding that possibly points 1666 01:04:16,789 --> 01:04:14,559 to the antiquity of the large subunit or 1667 01:04:18,470 --> 01:04:16,799 at least its function in relation to the small 1668 01:04:19,750 --> 01:04:18,480 subunit 1669 01:04:21,349 --> 01:04:19,760 the final result that i'll share with 1670 01:04:23,270 --> 01:04:21,359 you is that we did this same refolding 1671 01:04:25,190 --> 01:04:23,280 reaction in thermus thermophilus which 1672 01:04:27,029 --> 01:04:25,200 is a model thermophile and we were 1673 01:04:28,230 --> 01:04:27,039 actually very struck by the finding that 1674 01:04:30,630 --> 01:04:28,240 actually in contrast to what we 1675 01:04:32,710 --> 01:04:30,640 hypothesized proteins from thermus 1676 01:04:35,589 --> 01:04:32,720 were miserable refolders they were much 1677 01:04:37,990 --> 01:04:35,599 worse than e coli and yeast 1678 01:04:40,910 --> 01:04:38,000 so why do we think this is we think that 1679 01:04:43,589 --> 01:04:40,920 the way that evolution is able to create 1680 01:04:45,349 --> 01:04:43,599 thermo-tolerant proteins is maybe not 
1681 01:04:48,549 --> 01:04:45,359 through this classical mechanism of 1682 01:04:50,549 --> 01:04:48,559 having a very stable protein with a low 1683 01:04:53,990 --> 01:04:50,559 gibbs free energy but rather through a 1684 01:04:56,470 --> 01:04:54,000 kinetic trapping mechanism whereby the 1685 01:04:59,029 --> 01:04:56,480 barriers to exit the native state become 1686 01:05:00,870 --> 01:04:59,039 very high thereby trapping the protein 1687 01:05:03,670 --> 01:05:00,880 preventing thermal fluctuations from 1688 01:05:05,270 --> 01:05:03,680 unfolding it but by that same token it 1689 01:05:07,270 --> 01:05:05,280 means that if you wanted to refold that 1690 01:05:09,270 --> 01:05:07,280 protein after it was unfolded you'd be 1691 01:05:12,309 --> 01:05:09,280 in trouble because now those barriers 1692 01:05:14,789 --> 01:05:12,319 are going to act in both directions 1693 01:05:16,870 --> 01:05:14,799 so i'll summarize by trying to let you 1694 01:05:18,789 --> 01:05:16,880 know some of our current thinking about 1695 01:05:20,390 --> 01:05:18,799 how refoldability has affected the way 1696 01:05:22,390 --> 01:05:20,400 that at least in our lab we think about 1697 01:05:24,069 --> 01:05:22,400 the origins of life first of all we 1698 01:05:26,309 --> 01:05:24,079 think that the best refolders were 1699 01:05:29,349 --> 01:05:26,319 small topologically simple proteins that 1700 01:05:31,670 --> 01:05:29,359 bind peptides and nucleosides explicitly 1701 01:05:33,510 --> 01:05:31,680 not the synthetase folds now in some 1702 01:05:35,349 --> 01:05:33,520 ways this is maybe almost obvious you 1703 01:05:37,750 --> 01:05:35,359 know once you say it because synthetases 1704 01:05:39,109 --> 01:05:37,760 tend to be large multi-domain proteins 1705 01:05:40,870 --> 01:05:39,119 but i think it's worth pointing out that 1706 01:05:43,109 --> 01:05:40,880 this sort of notion that these represent 1707 01:05:45,109 --> 01:05:43,119 the most ancient proteins probably 1708 
01:05:47,910 --> 01:05:45,119 represents a ripple from an 1709 01:05:50,230 --> 01:05:47,920 implausibly strong rna world hypothesis 1710 01:05:52,470 --> 01:05:50,240 in which it has been posited by some 1711 01:05:53,990 --> 01:05:52,480 that proteins only became important once 1712 01:05:55,829 --> 01:05:54,000 you could encode them with an rna 1713 01:05:56,950 --> 01:05:55,839 template and of course in that train of 1714 01:05:59,109 --> 01:05:56,960 thought you couldn't even create 1715 01:06:00,870 --> 01:05:59,119 proteins until you had synthetases we 1716 01:06:03,270 --> 01:06:00,880 think the evidence from refoldability is 1717 01:06:05,349 --> 01:06:03,280 not consistent with that point of view 1718 01:06:07,270 --> 01:06:05,359 secondly we think that the large subunit 1719 01:06:09,270 --> 01:06:07,280 predated the small subunit so we think 1720 01:06:11,510 --> 01:06:09,280 that early life benefited from a catalyst 1721 01:06:13,430 --> 01:06:11,520 that could make peptide bonds before you 1722 01:06:15,589 --> 01:06:13,440 were able to encode that information in 1723 01:06:17,750 --> 01:06:15,599 a nucleic acid template we think that 1724 01:06:19,910 --> 01:06:17,760 that thinking nicely coheres with the 1725 01:06:21,510 --> 01:06:19,920 evolutionary and structural analysis 1726 01:06:23,910 --> 01:06:21,520 that the williams group has been working 1727 01:06:25,750 --> 01:06:23,920 on for several decades 1728 01:06:27,430 --> 01:06:25,760 we think that one thing that kind of 1729 01:06:29,510 --> 01:06:27,440 struck us is that tim barrels 1730 01:06:31,109 --> 01:06:29,520 actually are pretty miserable refolders 1731 01:06:33,910 --> 01:06:31,119 we think that's because these like key 1732 01:06:36,549 --> 01:06:33,920 metabolic processes co-evolved with 1733 01:06:38,150 --> 01:06:36,559 translation so essentially once you have 1734 01:06:40,150 --> 01:06:38,160 translation you can start to create 1735 01:06:42,230 --> 01:06:40,160 
proteins that are addicted to 1736 01:06:43,750 --> 01:06:42,240 translation in order to be able to fold 1737 01:06:45,750 --> 01:06:43,760 properly and so we think that 1738 01:06:48,230 --> 01:06:45,760 translation and glycolysis and the 1739 01:06:49,750 --> 01:06:48,240 synthetases by consequence 1740 01:06:51,910 --> 01:06:49,760 co-evolved together 1741 01:06:53,829 --> 01:06:51,920 and finally we think that it would have 1742 01:06:55,510 --> 01:06:53,839 been actually relatively difficult to 1743 01:06:57,430 --> 01:06:55,520 initially evolve proteins in a 1744 01:06:59,270 --> 01:06:57,440 thermophilic setting because it seems 1745 01:07:01,589 --> 01:06:59,280 that thermophilic proteins are more 1746 01:07:03,750 --> 01:07:01,599 reliant on a robust translational 1747 01:07:06,150 --> 01:07:03,760 apparatus in order to create these 1748 01:07:07,670 --> 01:07:06,160 kinetically trapped folds so in essence 1749 01:07:09,990 --> 01:07:07,680 if we had seen that thermophilic 1750 01:07:12,710 --> 01:07:10,000 proteins refold very easily we might 1751 01:07:14,630 --> 01:07:12,720 have been able to accept the hypothesis 1752 01:07:17,510 --> 01:07:14,640 that these were ancient proteins that 1753 01:07:19,990 --> 01:07:17,520 were more easily able to assemble before 1754 01:07:22,150 --> 01:07:20,000 the advent of translation but that's not 1755 01:07:23,910 --> 01:07:22,160 exactly what our results show i'll put 1756 01:07:25,670 --> 01:07:23,920 some asterisks there because i think we 1757 01:07:27,430 --> 01:07:25,680 need to test the hypothesis on more 1758 01:07:29,670 --> 01:07:27,440 thermophiles first but that is where our 1759 01:07:32,309 --> 01:07:29,680 current evidence is taking us 1760 01:07:34,950 --> 01:07:32,319 so with that i want to conclude just by 1761 01:07:36,789 --> 01:07:34,960 acknowledging the extreme 1762 01:07:38,549 --> 01:07:36,799 importance that dan tawfik has had in 1763 01:07:40,230 --> 01:07:38,559 shaping the 
thinking i think of a lot of 1764 01:07:41,430 --> 01:07:40,240 the people in this room as well as the 1765 01:07:43,029 --> 01:07:41,440 speakers 1766 01:07:44,710 --> 01:07:43,039 he's dearly missed and i'm glad that 1767 01:07:46,950 --> 01:07:44,720 we're able to 1768 01:07:48,870 --> 01:07:46,960 have a number of his trainees and 1769 01:07:56,470 --> 01:07:48,880 collaborators able to speak with 1770 01:08:01,029 --> 01:07:58,870 unfortunately we don't have time for 1771 01:08:08,150 --> 01:08:01,039 questions but at the end we will have 1772 01:08:13,510 --> 01:08:12,230 so our next speaker will be giving 1773 01:08:14,950 --> 01:08:13,520 a talk 1774 01:08:15,910 --> 01:08:14,960 remotely 1775 01:08:17,829 --> 01:08:15,920 it's 1776 01:08:22,229 --> 01:08:17,839 liam longo 1777 01:08:27,749 --> 01:08:24,709 i don't have the information uh from the 1778 01:08:42,390 --> 01:08:27,759 tokyo uh elsi in tokyo 1779 01:08:42,400 --> 01:08:48,149 uh we don't have sound 1780 01:08:56,149 --> 01:08:51,669 well dna and rna i'm going to replay it 1781 01:09:00,229 --> 01:08:58,309 hello the title of my talk today is 1782 01:09:02,630 --> 01:09:00,239 through the looking glass functional 1783 01:09:06,470 --> 01:09:02,640 ambidexterity in an ancient nucleic acid 1784 01:09:08,390 --> 01:09:06,480 binding protein and i'm liam longo from 1785 01:09:10,950 --> 01:09:08,400 elsi at the tokyo institute of 1786 01:09:13,030 --> 01:09:10,960 technology and this is a joint project 1787 01:09:16,229 --> 01:09:13,040 with norman metanis at the hebrew 1788 01:09:18,229 --> 01:09:16,239 university of jerusalem 1789 01:09:21,030 --> 01:09:18,239 biopolymers as we all know are 1790 01:09:22,470 --> 01:09:21,040 exquisitely homochiral proteins use l 1791 01:09:25,349 --> 01:09:22,480 amino acids 1792 01:09:27,349 --> 01:09:25,359 while dna and rna are derived from 1793 01:09:28,229 --> 01:09:27,359 d-ribose 1794 01:09:30,709 --> 01:09:28,239 and so 1795 
01:09:30,719 while homochirality is the rule in 1796 01:09:34,950 --> 01:09:33,520 biology its origins are actually quite 1797 01:09:37,189 --> 01:09:34,960 mysterious 1798 01:09:39,269 --> 01:09:37,199 i think everyone here would agree that 1799 01:09:41,749 --> 01:09:39,279 homochirality probably predates 1800 01:09:43,990 --> 01:09:41,759 luca the exact point of emergence of 1801 01:09:46,390 --> 01:09:44,000 homochirality is unclear 1802 01:09:49,430 --> 01:09:46,400 and it's also unclear to what extent the 1803 01:09:50,550 --> 01:09:49,440 emergence of homochirality in rna was 1804 01:09:52,309 --> 01:09:50,560 coupled 1805 01:09:53,910 --> 01:09:52,319 to the emergence of homochirality in 1806 01:09:56,390 --> 01:09:53,920 protein 1807 01:09:57,830 --> 01:09:56,400 and so although there are some very 1808 01:10:01,030 --> 01:09:57,840 interesting mechanisms that have been 1809 01:10:03,669 --> 01:10:01,040 proposed that can result in enantiomeric 1810 01:10:05,830 --> 01:10:03,679 excess in chemical systems 1811 01:10:08,310 --> 01:10:05,840 i think it's safe to say that the 1812 01:10:11,430 --> 01:10:08,320 question of homochirality in biology is 1813 01:10:14,870 --> 01:10:13,189 the veil between enantiomers is the 1814 01:10:16,709 --> 01:10:14,880 result of billions of years of 1815 01:10:19,110 --> 01:10:16,719 biological evolution 1816 01:10:22,310 --> 01:10:19,120 and the consequences of this veil were 1817 01:10:24,870 --> 01:10:22,320 first demonstrated by milton and kent 1818 01:10:27,270 --> 01:10:24,880 what milton and kent did is they 1819 01:10:29,750 --> 01:10:27,280 inverted the chirality of either hiv 1820 01:10:32,229 --> 01:10:29,760 protease or its substrate and they 1821 01:10:36,630 --> 01:10:32,239 showed that if you use the unnatural 1822 01:10:39,669 --> 01:10:36,640 couple so either l and d or d and l 1823 01:10:42,550 --> 01:10:39,679 you abolished activity but if you used 1824 01:10:44,630 --> 01:10:42,560 the natural couple or 
its mirror image 1825 01:10:46,550 --> 01:10:44,640 you actually had near equivalent 1826 01:10:49,270 --> 01:10:46,560 activity 1827 01:10:51,189 --> 01:10:49,280 and since then several technologies like 1828 01:10:53,990 --> 01:10:51,199 mirror image phage display and 1829 01:10:55,750 --> 01:10:54,000 spiegelmers have been developed to take 1830 01:10:58,790 --> 01:10:55,760 advantage of the properties of mirror 1831 01:11:01,189 --> 01:10:58,800 image molecules spiegelmers for example 1832 01:11:03,270 --> 01:11:01,199 are aptamers with high plasma stability 1833 01:11:05,189 --> 01:11:03,280 and low immunogenicity and this is 1834 01:11:07,590 --> 01:11:05,199 because they don't interact strongly 1835 01:11:09,189 --> 01:11:07,600 with nucleases or nucleic acid binding 1836 01:11:12,070 --> 01:11:09,199 proteins in the cell 1837 01:11:14,310 --> 01:11:12,080 but we wondered do the same truths hold 1838 01:11:16,310 --> 01:11:14,320 for the most ancient proteins 1839 01:11:17,590 --> 01:11:16,320 are they also highly sensitive to chiral 1840 01:11:20,550 --> 01:11:17,600 inversion 1841 01:11:23,430 --> 01:11:20,560 to ask this question we turn to a motif 1842 01:11:26,229 --> 01:11:23,440 called the helix hairpin helix motif 1843 01:11:28,870 --> 01:11:26,239 and vikram alva and andrei lupas have 1844 01:11:31,590 --> 01:11:28,880 shown that this is one of the most 1845 01:11:33,990 --> 01:11:31,600 ancient peptides and was at the origin 1846 01:11:36,070 --> 01:11:34,000 of folded proteins 1847 01:11:38,229 --> 01:11:36,080 what we've done previously 1848 01:11:40,149 --> 01:11:38,239 is we've used a combination of ancestor 1849 01:11:41,350 --> 01:11:40,159 reconstruction techniques and protein 1850 01:11:44,229 --> 01:11:41,360 engineering 1851 01:11:46,950 --> 01:11:44,239 to simplify the sequence of this motif 1852 01:11:49,750 --> 01:11:46,960 so that we can track its evolution from 1853 01:11:52,550 --> 01:11:49,760 a relatively unstructured peptide that 
1854 01:11:54,870 --> 01:11:52,560 phase separates with dna into a folded 1855 01:11:56,950 --> 01:11:54,880 domain with specific double strand dna 1856 01:11:59,030 --> 01:11:56,960 binding activity 1857 01:12:00,950 --> 01:11:59,040 here is that model in a little bit more 1858 01:12:02,709 --> 01:12:00,960 detail 1859 01:12:05,990 --> 01:12:02,719 a long long time ago 1860 01:12:08,550 --> 01:12:06,000 we had flexible peptides probably with a 1861 01:12:11,590 --> 01:12:08,560 polybasic sequence composition that 1862 01:12:13,990 --> 01:12:11,600 formed coacervates with rna 1863 01:12:16,229 --> 01:12:14,000 over time those peptides became 1864 01:12:18,709 --> 01:12:16,239 more complicated and they were able to 1865 01:12:20,870 --> 01:12:18,719 adopt compact structures 1866 01:12:22,870 --> 01:12:20,880 these compact structures in the case of 1867 01:12:25,110 --> 01:12:22,880 the helix hairpin helix motif could 1868 01:12:27,030 --> 01:12:25,120 potentially dimerize and these dimers 1869 01:12:29,430 --> 01:12:27,040 could promote the formation of more 1870 01:12:30,790 --> 01:12:29,440 stable coacervates or phase separated 1871 01:12:32,070 --> 01:12:30,800 droplets 1872 01:12:33,830 --> 01:12:32,080 eventually 1873 01:12:34,950 --> 01:12:33,840 upon duplication and fusion of this 1874 01:12:38,229 --> 01:12:34,960 motif 1875 01:12:40,390 --> 01:12:38,239 we could achieve what is now observed as 1876 01:12:42,790 --> 01:12:40,400 an independently folding double strand 1877 01:12:45,830 --> 01:12:42,800 dna binding domain 1878 01:12:48,709 --> 01:12:45,840 remarkably we've been able to track 1879 01:12:50,630 --> 01:12:48,719 every one of these stages experimentally 1880 01:12:52,870 --> 01:12:50,640 in the laboratory 1881 01:12:55,189 --> 01:12:52,880 and so we recently submitted an article 1882 01:12:57,669 --> 01:12:55,199 in collaboration with daniella goldfarb 1883 01:13:00,470 --> 01:12:57,679 and manas seal where we characterize the 1884 01:13:01,990 
--> 01:13:00,480 presence of these dimers inside the 1885 01:13:04,470 --> 01:13:02,000 coacervate 1886 01:13:06,709 --> 01:13:04,480 and with this model system in hand we 1887 01:13:08,229 --> 01:13:06,719 ask the question at what stage does 1888 01:13:10,149 --> 01:13:08,239 chirality matter 1889 01:13:12,950 --> 01:13:10,159 does it matter at the stage of forming 1890 01:13:15,910 --> 01:13:12,960 coacervates by a simple dimerizing 1891 01:13:17,350 --> 01:13:15,920 peptide or does it matter at the level 1892 01:13:19,830 --> 01:13:17,360 of an independently folding 1893 01:13:21,910 --> 01:13:19,840 double-stranded dna binding domain 1894 01:13:24,070 --> 01:13:21,920 we started off by testing whether or not 1895 01:13:26,149 --> 01:13:24,080 coacervation or phase separation was 1896 01:13:29,030 --> 01:13:26,159 sensitive to chiral inversion and so 1897 01:13:32,870 --> 01:13:29,040 we'd previously shown that the l-peptide 1898 01:13:34,630 --> 01:13:32,880 coacervates strongly with poly-u 1899 01:13:36,950 --> 01:13:34,640 when we inverted the chirality of the 1900 01:13:39,669 --> 01:13:36,960 l-peptide to form the d-peptide the 1901 01:13:42,229 --> 01:13:39,679 mirror-image peptide we found that it still 1902 01:13:44,950 --> 01:13:42,239 formed coacervates with poly-u 1903 01:13:46,790 --> 01:13:44,960 the differences here are because of the 1904 01:13:48,950 --> 01:13:46,800 cover slip we're using 1905 01:13:50,709 --> 01:13:48,960 it's not a fundamental property 1906 01:13:53,030 --> 01:13:50,719 of the system 1907 01:13:55,910 --> 01:13:53,040 nevertheless using a nanosight we were 1908 01:13:58,790 --> 01:13:55,920 able to see that actually the l-peptide 1909 01:14:03,110 --> 01:13:58,800 formed slightly more droplets than the 1910 01:14:05,590 --> 01:14:03,120 d-peptide at identical concentrations 1911 01:14:09,270 --> 01:14:05,600 now if you'll remember we previously 1912 01:14:11,430 --> 01:14:09,280 showed that inside the droplets there 
is 1913 01:14:13,669 --> 01:14:11,440 some folding of our peptide so we wanted 1914 01:14:16,070 --> 01:14:13,679 to test whether or not that folding was 1915 01:14:18,390 --> 01:14:16,080 important for coacervation 1916 01:14:22,229 --> 01:14:18,400 and to do that we generated a peptide 1917 01:14:24,310 --> 01:14:22,239 that had alternating d and l amino acids 1918 01:14:27,510 --> 01:14:24,320 such a peptide is unable to fold it's 1919 01:14:29,189 --> 01:14:27,520 unable to form alpha helices so 1920 01:14:31,510 --> 01:14:29,199 we tested whether or not this could 1921 01:14:34,149 --> 01:14:31,520 coacervate and indeed it could also 1922 01:14:36,870 --> 01:14:34,159 coacervate but it did so 1923 01:14:40,070 --> 01:14:36,880 with a lower propensity than either the 1924 01:14:42,790 --> 01:14:40,080 d-peptide or the l-peptide both of which 1925 01:14:45,990 --> 01:14:42,800 have the ability to fold and so we must 1926 01:14:49,990 --> 01:14:46,000 conclude that coacervation and phase 1927 01:14:52,070 --> 01:14:50,000 separation is robust to chiral inversion 1928 01:14:54,390 --> 01:14:52,080 and this isn't perhaps very surprising 1929 01:14:56,470 --> 01:14:54,400 because it's already been shown that 1930 01:14:59,030 --> 01:14:56,480 largely unstructured peptides 1931 01:15:02,310 --> 01:14:59,040 can phase separate with double-stranded 1932 01:15:04,310 --> 01:15:02,320 or single-stranded dna or rna 1933 01:15:06,070 --> 01:15:04,320 this is perhaps because the nature of 1934 01:15:10,550 --> 01:15:06,080 the interactions that drive phase 1935 01:15:12,390 --> 01:15:10,560 separation tend to be transient and weak 1936 01:15:14,310 --> 01:15:12,400 this is not the case for an 1937 01:15:17,030 --> 01:15:14,320 independently folding domain binding to 1938 01:15:18,709 --> 01:15:17,040 double-strand dna so how do we expect 1939 01:15:21,189 --> 01:15:18,719 this domain to withstand chiral 1940 01:15:23,189 --> 01:15:21,199 inversion to answer this question we 
1941 01:15:25,990 --> 01:15:23,199 synthesized the full-length 1942 01:15:27,990 --> 01:15:26,000 double-strand dna binding domain in both 1943 01:15:29,510 --> 01:15:28,000 the mirror image chirality and the 1944 01:15:31,350 --> 01:15:29,520 natural chirality 1945 01:15:33,270 --> 01:15:31,360 and so this is the circular dichroism 1946 01:15:35,590 --> 01:15:33,280 spectra which reports on the secondary 1947 01:15:37,669 --> 01:15:35,600 structure of our domain we can see that 1948 01:15:41,270 --> 01:15:37,679 both domains are alpha-helical so they 1949 01:15:42,630 --> 01:15:41,280 have peaks at about 208 and 222 but that 1950 01:15:44,790 --> 01:15:42,640 they have an inverted circular 1951 01:15:47,669 --> 01:15:44,800 dichroism spectrum because they have 1952 01:15:48,630 --> 01:15:47,679 helices of opposite handedness 1953 01:15:51,270 --> 01:15:48,640 now 1954 01:15:53,590 --> 01:15:51,280 using these two proteins we tested their 1955 01:15:54,950 --> 01:15:53,600 ability to bind double-strand dna using 1956 01:15:56,950 --> 01:15:54,960 spr 1957 01:16:00,870 --> 01:15:56,960 and we tested their ability to bind not 1958 01:16:02,149 --> 01:16:00,880 just the natural dna but we also used 1959 01:16:04,709 --> 01:16:02,159 l-dna 1960 01:16:06,470 --> 01:16:04,719 this makes it so that our experiment has 1961 01:16:08,709 --> 01:16:06,480 a natural control 1962 01:16:11,430 --> 01:16:08,719 embedded in it because we expect that 1963 01:16:13,830 --> 01:16:11,440 the mirror universe should have similar 1964 01:16:15,669 --> 01:16:13,840 affinities to our universe 1965 01:16:18,550 --> 01:16:15,679 and so as you can see here 1966 01:16:21,430 --> 01:16:18,560 both the l protein binding to the d dna 1967 01:16:24,630 --> 01:16:21,440 and the d protein binding to the l dna 1968 01:16:27,510 --> 01:16:24,640 they have a similar interaction affinity 1969 01:16:30,149 --> 01:16:27,520 surprisingly when we looked at l protein 1970 01:16:32,950 --> 01:16:30,159 binding 
to l-dna or d protein binding to 1971 01:16:34,790 --> 01:16:32,960 d dna that is the case where only one of 1972 01:16:36,550 --> 01:16:34,800 the binding partners has an inverted 1973 01:16:39,110 --> 01:16:36,560 chirality 1974 01:16:41,430 --> 01:16:39,120 we still saw significant evidence of 1975 01:16:43,510 --> 01:16:41,440 binding and even in the tens of 1976 01:16:46,630 --> 01:16:43,520 micromolar concentration we have 1977 01:16:49,030 --> 01:16:46,640 unambiguous evidence of binding of our 1978 01:16:50,790 --> 01:16:49,040 protein to the dna 1979 01:16:52,470 --> 01:16:50,800 and we wanted to assess whether or not 1980 01:16:54,870 --> 01:16:52,480 this was the result of the background 1981 01:16:58,229 --> 01:16:54,880 binding of the fold itself and not the 1982 01:17:01,189 --> 01:16:58,239 result of specific binding to our domain 1983 01:17:02,790 --> 01:17:01,199 to do this we mutated the canonical 1984 01:17:05,910 --> 01:17:02,800 pgigp 1985 01:17:07,910 --> 01:17:05,920 binding loops to five glycines this is 1986 01:17:10,070 --> 01:17:07,920 in a sense an entropy mutation because 1987 01:17:12,229 --> 01:17:10,080 it doesn't change the overall charge of 1988 01:17:14,950 --> 01:17:12,239 the protein it just makes it so that 1989 01:17:16,709 --> 01:17:14,960 these loops are more flexible and thus 1990 01:17:18,870 --> 01:17:16,719 less likely to adopt the correct 1991 01:17:20,950 --> 01:17:18,880 conformation for binding 1992 01:17:23,830 --> 01:17:20,960 when we do this we observe that the l 1993 01:17:25,030 --> 01:17:23,840 primordial hhh protein with the five 1994 01:17:28,070 --> 01:17:25,040 glycines 1995 01:17:30,550 --> 01:17:28,080 actually binds worse than total chiral 1996 01:17:33,590 --> 01:17:30,560 inversion of the protein domain 1997 01:17:36,149 --> 01:17:33,600 on 29 base pair double stranded dna in 1998 01:17:39,189 --> 01:17:36,159 the natural chiral conformation we can 1999 01:17:42,070 --> 01:17:39,199 see that the d mirror 
protein binds better 2000 01:17:44,470 --> 01:17:42,080 than the l primordial arch protein with 2001 01:17:46,229 --> 01:17:44,480 the 5g mutation 2002 01:17:47,910 --> 01:17:46,239 when you look at 101 base pair 2003 01:17:50,229 --> 01:17:47,920 double-stranded dna 2004 01:17:52,229 --> 01:17:50,239 we see the difference is even larger and 2005 01:17:55,030 --> 01:17:52,239 this is because in our system we've 2006 01:17:57,270 --> 01:17:55,040 observed that the longer the dna strand 2007 01:17:59,510 --> 01:17:57,280 the higher the binding affinity perhaps 2008 01:18:02,070 --> 01:17:59,520 due to some cooperativity 2009 01:18:05,030 --> 01:18:02,080 it's relatively easy to understand how a 2010 01:18:07,110 --> 01:18:05,040 single helix hairpin helix motif could 2011 01:18:08,790 --> 01:18:07,120 bind to double-stranded dna or 2012 01:18:11,510 --> 01:18:08,800 single-stranded dna 2013 01:18:14,550 --> 01:18:11,520 regardless of its chirality 2014 01:18:17,030 --> 01:18:14,560 what's harder to understand is how when 2015 01:18:19,270 --> 01:18:17,040 you have a duplicated domain and these 2016 01:18:20,229 --> 01:18:19,280 two loops are juxtaposed relative to 2017 01:18:22,390 --> 01:18:20,239 each other 2018 01:18:24,550 --> 01:18:22,400 how they could correctly insert into the 2019 01:18:26,470 --> 01:18:24,560 minor groove without a significant 2020 01:18:28,149 --> 01:18:26,480 rearrangement this is a question that 2021 01:18:31,590 --> 01:18:28,159 we're currently addressing with md 2022 01:18:35,910 --> 01:18:33,510 but now we have to grapple with the 2023 01:18:38,229 --> 01:18:35,920 question which is why would an ancient 2024 01:18:40,550 --> 01:18:38,239 nucleic acid binding domain be 2025 01:18:43,270 --> 01:18:40,560 ambidextrous why should an ancient 2026 01:18:45,750 --> 01:18:43,280 domain be able to bind in effectively 2027 01:18:47,669 --> 01:18:45,760 both chiral forms 2028 01:18:49,350 --> 01:18:47,679 and i would like to acknowledge right 
2029 01:18:51,590 --> 01:18:49,360 out of the gate that this could be the 2030 01:18:54,550 --> 01:18:51,600 result of chance 2031 01:18:56,550 --> 01:18:54,560 some domains are surely ambidextrous 2032 01:18:58,709 --> 01:18:56,560 just by chance and that this has nothing 2033 01:19:00,550 --> 01:18:58,719 to do with the early history of the fold 2034 01:19:03,110 --> 01:19:00,560 and so if this was the case it would 2035 01:19:05,350 --> 01:19:03,120 predict that as we test more domains for 2036 01:19:07,110 --> 01:19:05,360 this property of ambidexterity the 2037 01:19:08,790 --> 01:19:07,120 ancient domains will have no greater 2038 01:19:10,950 --> 01:19:08,800 preference for ambidexterity than any 2039 01:19:13,510 --> 01:19:10,960 other fold so i want to acknowledge this 2040 01:19:15,830 --> 01:19:13,520 possibility right at the outset i think 2041 01:19:17,669 --> 01:19:15,840 it's a very reasonable one 2042 01:19:19,350 --> 01:19:17,679 but i'd also like to lean into the 2043 01:19:21,590 --> 01:19:19,360 result a bit more 2044 01:19:24,310 --> 01:19:21,600 what would it mean if the history of 2045 01:19:26,630 --> 01:19:24,320 homochirality was written into the most 2046 01:19:29,590 --> 01:19:26,640 ancient domains and if this history was 2047 01:19:31,830 --> 01:19:29,600 somehow observable by their ability to 2048 01:19:33,350 --> 01:19:31,840 be ambidextrous 2049 01:19:35,590 --> 01:19:33,360 what would that mean 2050 01:19:38,950 --> 01:19:35,600 and could that be a relic of a time when 2051 01:19:40,709 --> 01:19:38,960 amino acid preferences were emerging in 2052 01:19:42,229 --> 01:19:40,719 a complex community of competing 2053 01:19:44,390 --> 01:19:42,239 organisms 2054 01:19:47,910 --> 01:19:44,400 in the model i've got here 2055 01:19:50,790 --> 01:19:47,920 we have an ancient ribosome a primitive 2056 01:19:53,189 --> 01:19:50,800 rna-based translation machine and it has 2057 01:19:54,550 --> 01:19:53,199 no preference for either l or d
amino 2058 01:19:56,470 --> 01:19:54,560 acids 2059 01:19:58,310 --> 01:19:56,480 the resulting peptide would likely be 2060 01:20:00,630 --> 01:19:58,320 unstructured but it would still be able 2061 01:20:02,390 --> 01:20:00,640 to perform some simple function kind of 2062 01:20:04,790 --> 01:20:02,400 like the phase separating peptide we saw 2063 01:20:07,110 --> 01:20:04,800 at the beginning of the talk 2064 01:20:09,590 --> 01:20:07,120 over time however this primitive 2065 01:20:12,229 --> 01:20:09,600 rna-based translation machinery 2066 01:20:15,510 --> 01:20:12,239 would eventually develop some chiral 2067 01:20:16,229 --> 01:20:15,520 preference for either d or l amino acids 2068 01:20:19,709 --> 01:20:16,239 if 2069 01:20:22,390 --> 01:20:19,719 a community of these d and l preferring 2070 01:20:27,270 --> 01:20:22,400 proto-ribosomes existed along with 2071 01:20:28,709 --> 01:20:27,280 ribozyme aminoacyl trna synthetases 2072 01:20:29,750 --> 01:20:28,719 any gene 2073 01:20:32,629 --> 01:20:29,760 that could 2074 01:20:34,550 --> 01:20:32,639 operate in either chirality 2075 01:20:35,669 --> 01:20:34,560 would have an advantage in that 2076 01:20:37,590 --> 01:20:35,679 community 2077 01:20:39,750 --> 01:20:37,600 in other words an ancient preference for 2078 01:20:42,070 --> 01:20:39,760 ambidextrous protein domains could be 2079 01:20:45,590 --> 01:20:42,080 the result of a competition within a 2080 01:20:47,830 --> 01:20:45,600 complex community of early life that had 2081 01:20:50,310 --> 01:20:47,840 different amino acid preferences but 2082 01:20:52,149 --> 01:20:50,320 were sharing genes and any gene that 2083 01:20:54,629 --> 01:20:52,159 could have functioned in either chiral 2084 01:20:57,030 --> 01:20:54,639 form would have had a distinct advantage 2085 01:20:59,430 --> 01:20:57,040 in this complex community and it's from 2086 01:21:04,629 --> 01:20:59,440 this that we came up with the idea of an 2087 01:21:08,470 --> 01:21:06,550 and so
with that i would like to thank 2088 01:21:10,229 --> 01:21:08,480 you for your attention i would like to 2089 01:21:11,510 --> 01:21:10,239 thank my wonderful collaborators for 2090 01:21:14,070 --> 01:21:11,520 their hard work 2091 01:21:15,910 --> 01:21:14,080 and if this theory sounds too crazy or 2092 01:21:22,410 --> 01:21:15,920 just crazy enough and you'd like to talk 2093 01:21:28,310 --> 01:21:26,629 [Applause] 2094 01:21:29,830 --> 01:21:28,320 thank you liam i'm sorry that we won't 2095 01:21:32,070 --> 01:21:29,840 be able to chat with you more here but 2096 01:21:33,430 --> 01:21:32,080 hopefully some of us will do offline or 2097 01:21:35,669 --> 01:21:33,440 by email 2098 01:21:38,310 --> 01:21:35,679 um and with that i'd like to introduce 2099 01:21:40,950 --> 01:21:38,320 our final presenter who's also coming to 2100 01:21:43,510 --> 01:21:40,960 us remotely from the charles university 2101 01:21:44,229 --> 01:21:43,520 of prague in the czech republic and this 2102 01:21:45,350 --> 01:21:44,239 is 2103 01:21:50,950 --> 01:21:45,360 um 2104 01:21:57,430 --> 01:21:54,070 good evening good evening from israel 2105 01:21:59,189 --> 01:21:57,440 i'm going to present part of the work 2106 01:22:00,149 --> 01:21:59,199 which i did at charles university in 2107 01:22:01,350 --> 01:22:00,159 prague 2108 01:22:04,550 --> 01:22:01,360 and 2109 01:22:08,070 --> 01:22:04,560 now i'm residing at weitzman institute 2110 01:22:10,390 --> 01:22:08,080 so in our lab we were considering this 2111 01:22:11,510 --> 01:22:10,400 peculiar disparity just mentioned by 2112 01:22:15,830 --> 01:22:11,520 claudia 2113 01:22:18,390 --> 01:22:15,840 that with only 100 residue protein 2114 01:22:20,070 --> 01:22:18,400 we can construct 20 to 100 possible 2115 01:22:22,709 --> 01:22:20,080 protein sequences 2116 01:22:24,790 --> 01:22:22,719 but approximately only 10 to 15 2117 01:22:25,830 --> 01:22:24,800 different protein sequences are used by 2118 01:22:27,910 --> 
01:22:25,840 nature 2119 01:22:28,870 --> 01:22:27,920 so why is that and what is hidden in 2120 01:22:31,510 --> 01:22:28,880 this 2121 01:22:35,189 --> 01:22:31,520 dark protein space was exactly what we 2122 01:22:38,950 --> 01:22:35,199 were interesting interested 2123 01:22:41,510 --> 01:22:38,960 so long story short we made in vitro 2124 01:22:42,790 --> 01:22:41,520 random libraries 2125 01:22:45,030 --> 01:22:42,800 and 2126 01:22:47,110 --> 01:22:45,040 for doing so we used two different amino 2127 01:22:49,910 --> 01:22:47,120 acid alphabet full alphabet consisting 2128 01:22:51,830 --> 01:22:49,920 of all 20 amino acids and so-called 2129 01:22:54,790 --> 01:22:51,840 early alphabet 2130 01:22:56,470 --> 01:22:54,800 which used only periodically available 2131 01:23:00,229 --> 01:22:56,480 amino acids 2132 01:23:02,950 --> 01:23:00,239 length 2133 01:23:05,430 --> 01:23:02,960 consisting of these randomized parts and 2134 01:23:08,550 --> 01:23:05,440 we introduced the thrombin 2135 01:23:10,629 --> 01:23:08,560 protease cleavage site in the middle 2136 01:23:12,310 --> 01:23:10,639 so the first essay which we tried was 2137 01:23:14,550 --> 01:23:12,320 the solubility 2138 01:23:16,550 --> 01:23:14,560 section of the library 2139 01:23:19,669 --> 01:23:16,560 so that was assessed simply by 2140 01:23:20,870 --> 01:23:19,679 expression of our randomized billions of 2141 01:23:23,510 --> 01:23:20,880 sequences 2142 01:23:25,590 --> 01:23:23,520 in a cell free expression system and 2143 01:23:28,070 --> 01:23:25,600 western lotting and for solubility we 2144 01:23:31,270 --> 01:23:28,080 just spinned the mixture and took the 2145 01:23:32,870 --> 01:23:31,280 supernatant in supernatant to assess the 2146 01:23:33,669 --> 01:23:32,880 soluble fraction 2147 01:23:35,350 --> 01:23:33,679 so 2148 01:23:39,189 --> 01:23:35,360 upon the expression in three different 2149 01:23:41,350 --> 01:23:39,199 temperatures 25 30 and 37 degrees 2150 01:23:42,629 --> 
01:23:41,360 we've seen a monotonic increase in 2151 01:23:44,709 --> 01:23:42,639 expression 2152 01:23:46,830 --> 01:23:44,719 in early and full 2153 01:23:49,590 --> 01:23:46,840 amino acid alphabet libraries as 2154 01:23:51,350 --> 01:23:49,600 expected but 2155 01:23:53,910 --> 01:23:51,360 the solubility of these two libraries 2156 01:23:57,189 --> 01:23:53,920 showed that while early alphabet 2157 01:23:59,590 --> 01:23:57,199 proteins are essentially fully soluble 2158 01:24:02,470 --> 01:23:59,600 at all temperatures the full alphabet 2159 01:24:04,870 --> 01:24:02,480 library is only partially soluble and 2160 01:24:06,390 --> 01:24:04,880 its solubility remains approximately 2161 01:24:08,950 --> 01:24:06,400 constant 2162 01:24:11,750 --> 01:24:08,960 within our temperature range 2163 01:24:15,110 --> 01:24:11,760 so next i tried to add the chaperone dnak 2164 01:24:18,709 --> 01:24:15,120 into the cell-free mixture and again i 2165 01:24:20,790 --> 01:24:18,719 saw no effect in early alphabet library 2166 01:24:23,669 --> 01:24:20,800 so supplementation of chaperone did not 2167 01:24:25,430 --> 01:24:23,679 improve the expression in any way 2168 01:24:27,750 --> 01:24:25,440 but in the full amino acid 2169 01:24:31,270 --> 01:24:27,760 alphabet library i see a small deviation 2170 01:24:33,189 --> 01:24:31,280 however the difference is not large 2171 01:24:35,750 --> 01:24:33,199 the interesting part is that 2172 01:24:38,870 --> 01:24:35,760 the soluble part of the libraries 2173 01:24:40,950 --> 01:24:38,880 of chaperone-supplemented libraries 2174 01:24:43,350 --> 01:24:40,960 showed interesting trends that early 2175 01:24:47,669 --> 01:24:43,360 amino acid alphabet library remained 2176 01:24:50,870 --> 01:24:47,679 soluble as shown before but the full 2177 01:24:52,790 --> 01:24:50,880 amino acid alphabet library got 2178 01:24:54,790 --> 01:24:52,800 completely solubilized in the presence 2179 01:24:56,709 --> 01:24:54,800 of chaperone which
means that chaperone 2180 01:24:58,550 --> 01:24:56,719 can actually act on 2181 01:24:59,830 --> 01:24:58,560 proteins without any evolutionary 2182 01:25:02,629 --> 01:24:59,840 background 2183 01:25:05,350 --> 01:25:02,639 so the next assay after our 2184 01:25:07,830 --> 01:25:05,360 centrifugation solubility assay was the 2185 01:25:09,510 --> 01:25:07,840 proteolysis assay which allowed us to 2186 01:25:11,910 --> 01:25:09,520 separate 2187 01:25:13,189 --> 01:25:11,920 the whole combinatorial library into 2188 01:25:15,669 --> 01:25:13,199 four parts 2189 01:25:17,510 --> 01:25:15,679 the soluble undegradable and degradable 2190 01:25:20,149 --> 01:25:17,520 and insoluble degradable and undegradable 2191 01:25:22,149 --> 01:25:20,159 which correspond to the more structured 2192 01:25:23,990 --> 01:25:22,159 parts of soluble proteins 2193 01:25:25,750 --> 01:25:24,000 and more 2194 01:25:27,110 --> 01:25:25,760 disordered parts of soluble and 2195 01:25:28,950 --> 01:25:27,120 insoluble 2196 01:25:30,790 --> 01:25:28,960 fraction of the library 2197 01:25:33,590 --> 01:25:30,800 so 2198 01:25:35,669 --> 01:25:33,600 these are the results the this figure is 2199 01:25:38,229 --> 01:25:35,679 quite complicated and i have no chance 2200 01:25:39,750 --> 01:25:38,239 to describe all the juicy details which 2201 01:25:40,709 --> 01:25:39,760 are contained within it 2202 01:25:42,790 --> 01:25:40,719 but 2203 01:25:45,830 --> 01:25:42,800 let's consider only the 2204 01:25:47,430 --> 01:25:45,840 dark blue part of all these 2205 01:25:49,430 --> 01:25:47,440 results of 2206 01:25:52,070 --> 01:25:49,440 full amino acid alphabet and early amino 2207 01:25:54,870 --> 01:25:52,080 acid alphabet libraries without and with 2208 01:25:56,870 --> 01:25:54,880 chaperones we see that structured 2209 01:25:58,149 --> 01:25:56,880 fraction is prevalent in all four 2210 01:26:01,350 --> 01:25:58,159 conditions 2211 01:26:03,669 --> 01:26:01,360 and upon the addition of chaperones we
2212 01:26:05,669 --> 01:26:03,679 do not see any induction of the 2213 01:26:09,990 --> 01:26:05,679 structure that means that 2214 01:26:15,030 --> 01:26:13,110 coded within its primaries 2215 01:26:17,830 --> 01:26:15,040 so in conclusion 2216 01:26:20,070 --> 01:26:17,840 we think that early alphabet is soluble 2217 01:26:21,990 --> 01:26:20,080 and chaperone independent 2218 01:26:24,310 --> 01:26:22,000 that full alphabet is solubilized by 2219 01:26:27,110 --> 01:26:24,320 chaperones we observed similar compacted 2220 01:26:29,030 --> 01:26:27,120 structure frequency in both libraries uh 2221 01:26:29,750 --> 01:26:29,040 i've seen that chaperones do not promote 2222 01:26:34,229 --> 01:26:29,760 the 2223 01:26:35,910 --> 01:26:34,239 possible structure formation in a 2224 01:26:36,950 --> 01:26:35,920 prebiotically plausible alphabet 2225 01:26:39,430 --> 01:26:36,960 libraries 2226 01:26:41,270 --> 01:26:39,440 and we showed that chaperones do 2227 01:26:42,470 --> 01:26:41,280 positively interact with the random 2228 01:26:44,950 --> 01:26:42,480 sequences 2229 01:26:47,430 --> 01:26:44,960 so with all of that i 2230 01:26:49,510 --> 01:26:47,440 recommend you to look at our paper where 2231 01:26:50,470 --> 01:26:49,520 we describe many other interesting 2232 01:26:53,110 --> 01:26:50,480 things 2233 01:26:56,229 --> 01:26:53,120 uh on how we made shop more homemade 2234 01:26:58,550 --> 01:26:56,239 libraries how the library is 2235 01:27:00,470 --> 01:26:58,560 behaving upon the heat shock different 2236 01:27:01,669 --> 01:27:00,480 protease essay and bioinformatic 2237 01:27:05,030 --> 01:27:01,679 predictions 2238 01:27:07,430 --> 01:27:05,040 and with all that thank you thank clara 2239 01:27:09,180 --> 01:27:07,440 and thank organizers to you of the 2240 01:27:15,350 --> 01:27:09,190 conference 2241 01:27:15,360 --> 01:27:21,110 lovely thank you 2242 01:27:25,990 --> 01:27:24,149 so um let's have maybe a few minutes of 2243 01:27:28,229 --> 
01:27:26,000 discussion with all the speakers so 2244 01:27:30,390 --> 01:27:28,239 speakers who are in person um you can 2245 01:27:32,470 --> 01:27:30,400 maybe join us on the panel those who are 2246 01:27:34,070 --> 01:27:32,480 online maybe stay in the room 2247 01:27:36,070 --> 01:27:34,080 if you have a question for any of the 2248 01:27:38,470 --> 01:27:36,080 speakers please uh 2249 01:27:40,229 --> 01:27:38,480 line up behind one of the mics and 2250 01:27:41,830 --> 01:27:40,239 we'll probably spend more time hanging 2251 01:27:56,310 --> 01:27:41,840 out after this because there's nothing 2252 01:28:00,149 --> 01:27:58,390 hello there shelby osborne university of 2253 01:28:03,350 --> 01:28:00,159 arkansas center for planetary and space 2254 01:28:05,669 --> 01:28:03,360 sciences this is a question for dr freud 2255 01:28:06,950 --> 01:28:05,679 is that how you pronounce it oh sorry 2256 01:28:08,950 --> 01:28:06,960 it's free dude 2257 01:28:10,870 --> 01:28:08,960 well i'm from arkansas so we just say 2258 01:28:13,830 --> 01:28:10,880 fried all the time 2259 01:28:18,070 --> 01:28:13,840 so i was just going to ask you e coli 2260 01:28:21,110 --> 01:28:18,080 and yeast have a lot of similar enzymes 2261 01:28:22,629 --> 01:28:21,120 and generally we study those in tandem 2262 01:28:25,590 --> 01:28:22,639 anyways 2263 01:28:28,390 --> 01:28:25,600 what would the approach be if you had a 2264 01:28:30,310 --> 01:28:28,400 protein or a ribonuclease sequence 2265 01:28:32,470 --> 01:28:30,320 and you wanted to know what that 2266 01:28:34,629 --> 01:28:32,480 sequence was like before 2267 01:28:37,110 --> 01:28:34,639 the modern folding but you don't know 2268 01:28:38,310 --> 01:28:37,120 what the original or analogous structure 2269 01:28:40,470 --> 01:28:38,320 was 2270 01:28:41,510 --> 01:28:40,480 yeah cool that's a great question 2271 01:28:42,950 --> 01:28:41,520 um 2272 01:28:45,030 --> 01:28:42,960 so 2273 01:28:46,950 --> 01:28:45,040 the the 
trends that we see in e coli and 2274 01:28:48,950 --> 01:28:46,960 the trends that we see in yeast are 2275 01:28:50,790 --> 01:28:48,960 basically the same so like whatever is 2276 01:28:52,870 --> 01:28:50,800 more refillable in e coli is also more 2277 01:28:54,950 --> 01:28:52,880 refillable in yeast it's just that in 2278 01:28:57,510 --> 01:28:54,960 any given category the yeast ortholog is 2279 01:28:59,750 --> 01:28:57,520 generally more reflectable on average by 2280 01:29:01,669 --> 01:28:59,760 about 15 to 20 percent and we've 2281 01:29:04,229 --> 01:29:01,679 recently i think come up with a pretty 2282 01:29:06,149 --> 01:29:04,239 um convincing explanation for why that 2283 01:29:07,830 --> 01:29:06,159 is and it can be basically explained in 2284 01:29:09,990 --> 01:29:07,840 terms of the fact that yeast proteins 2285 01:29:12,870 --> 01:29:10,000 are more disordered so the extra 2286 01:29:15,430 --> 01:29:12,880 disorder that tends to punctuate between 2287 01:29:17,430 --> 01:29:15,440 the folded domains and yeast proteins 2288 01:29:18,870 --> 01:29:17,440 seems to make it easier to refold them 2289 01:29:21,270 --> 01:29:18,880 off the ribosome because they're sort of 2290 01:29:23,990 --> 01:29:21,280 less likely to get in each other's way 2291 01:29:26,629 --> 01:29:24,000 whereas the e coli proteins tend to have 2292 01:29:29,110 --> 01:29:26,639 very short if any disordered linkers at 2293 01:29:32,070 --> 01:29:29,120 all and that in our opinion or at least 2294 01:29:32,790 --> 01:29:32,080 our hypothesis is that that destined to 2295 01:29:35,830 --> 01:29:32,800 be 2296 01:29:38,550 --> 01:29:35,840 dependent on translation to fold 2297 01:29:40,629 --> 01:29:38,560 and if we didn't know for example that e 2298 01:29:42,950 --> 01:29:40,639 coli and yeast were correlated how would 2299 01:29:44,470 --> 01:29:42,960 we approach the problem of figuring out 2300 01:29:46,310 --> 01:29:44,480 what the 2301 01:29:48,550 --> 01:29:46,320 
previous structure 2302 01:29:51,030 --> 01:29:48,560 and enzymes and 2303 01:29:52,550 --> 01:29:51,040 proteins of yeast would have been if we 2304 01:29:55,030 --> 01:29:52,560 didn't know that e coli existed we mean 2305 01:29:56,629 --> 01:29:55,040 like the ancestral sequences 2306 01:29:57,830 --> 01:29:56,639 like the precursor 2307 01:29:59,590 --> 01:29:57,840 oh i see 2308 01:30:01,830 --> 01:29:59,600 i mean we could do we haven't done it 2309 01:30:03,030 --> 01:30:01,840 yet but a cool experiment to do would be 2310 01:30:05,430 --> 01:30:03,040 to do the sort of ancestral 2311 01:30:07,350 --> 01:30:05,440 reconstruction and ask you know how does 2312 01:30:08,950 --> 01:30:07,360 the property change for 2313 01:30:10,709 --> 01:30:08,960 proteins that are perceived to be more 2314 01:30:12,390 --> 01:30:10,719 ancient but we haven't done that yet 2315 01:30:14,149 --> 01:30:12,400 okay interesting and may i get your 2316 01:30:16,790 --> 01:30:14,159 contact information after the question 2317 01:30:20,629 --> 01:30:16,800 maybe offline just so some of you okay 2318 01:30:26,390 --> 01:30:23,590 josh ariola uc san diego i had a quick 2319 01:30:28,950 --> 01:30:26,400 question for valerio 2320 01:30:30,629 --> 01:30:28,960 um i was wondering if you were able to 2321 01:30:33,350 --> 01:30:30,639 observe any 2322 01:30:36,310 --> 01:30:33,360 protective effect on the rna by the 2323 01:30:37,510 --> 01:30:36,320 peptide or the protein 2324 01:30:39,669 --> 01:30:37,520 um 2325 01:30:41,270 --> 01:30:39,679 protective like effect you mean like 2326 01:30:43,430 --> 01:30:41,280 yeah yeah 2327 01:30:45,750 --> 01:30:43,440 if you had like 2328 01:30:47,669 --> 01:30:45,760 high magnesium and high ph i was 2329 01:30:50,629 --> 01:30:47,679 wondering if you could see 2330 01:30:53,510 --> 01:30:50,639 less rna cleavage like self cleavage 2331 01:30:55,110 --> 01:30:53,520 when you have the peptide present 2332 01:30:57,030 --> 01:30:55,120 no we didn't 
perform this kind of 2333 01:31:00,310 --> 01:30:57,040 experiment and we perform like a 2334 01:31:02,470 --> 01:31:00,320 hydrolysis by rnases and proteases 2335 01:31:04,950 --> 01:31:02,480 and that one yeah we perform it so like 2336 01:31:07,910 --> 01:31:04,960 removing uh actually there's this uh by 2337 01:31:11,110 --> 01:31:07,920 rnase so we had like uh a chelator edta 2338 01:31:13,990 --> 01:31:11,120 in the in the media and uh we saw that 2339 01:31:17,430 --> 01:31:14,000 when we added edta the complex gets 2340 01:31:19,030 --> 01:31:17,440 degraded when instead like the edta 2341 01:31:21,590 --> 01:31:19,040 it's removed from the media so there is 2342 01:31:23,750 --> 01:31:21,600 magnesium the the complex is stable and 2343 01:31:24,470 --> 01:31:23,760 the rnase is not able to degrade it 2344 01:31:35,830 --> 01:31:24,480 so 2345 01:31:37,990 --> 01:31:35,840 protect the the the the binding from the 2346 01:31:40,790 --> 01:31:38,000 the cleavage by the protein rnase in 2347 01:31:42,390 --> 01:31:40,800 presence of magnesium or not so 2348 01:31:44,629 --> 01:31:42,400 but yeah it's a good experiment like to 2349 01:31:47,110 --> 01:31:44,639 to try also with the higher 2350 01:31:50,229 --> 01:31:47,120 concentration and titration yeah cool 2351 01:31:54,790 --> 01:31:53,030 hi um i'm self son from university of 2352 01:31:55,830 --> 01:31:54,800 arizona and i have a question to steven 2353 01:31:58,310 --> 01:31:55,840 freed 2354 01:32:00,790 --> 01:31:58,320 uh i know this is a long shot but i was 2355 01:32:02,149 --> 01:32:00,800 wondering if there is a software or 2356 01:32:04,149 --> 01:32:02,159 something 2357 01:32:06,149 --> 01:32:04,159 that allows you to calculate 2358 01:32:07,990 --> 01:32:06,159 refoldability as a metric from the 2359 01:32:10,229 --> 01:32:08,000 sequence just like you calculate 2360 01:32:11,830 --> 01:32:10,239 disorder propensity i think it's an amazing 2361 01:32:13,030 --> 01:32:11,840 goal that
we would love to be able to do 2362 01:32:15,510 --> 01:32:13,040 and i think that 2363 01:32:17,830 --> 01:32:15,520 the the stage that we're operating at is 2364 01:32:19,669 --> 01:32:17,840 to try to collect features like 2365 01:32:21,510 --> 01:32:19,679 biophysical structural that we can 2366 01:32:23,270 --> 01:32:21,520 associate with it and then 2367 01:32:24,950 --> 01:32:23,280 i think that ultimately as we sort of 2368 01:32:27,030 --> 01:32:24,960 get more and more features and map more 2369 01:32:29,030 --> 01:32:27,040 protiums it shouldn't be too crazy to 2370 01:32:31,350 --> 01:32:29,040 involve some machine learning algorithm 2371 01:32:33,030 --> 01:32:31,360 to assimilate it all together but 2372 01:32:34,550 --> 01:32:33,040 for that i'll maybe ask for your help 2373 01:32:36,550 --> 01:32:34,560 because to me machine learning still 2374 01:32:41,030 --> 01:32:36,560 mystifies me 2375 01:32:44,470 --> 01:32:42,950 so i have two questions one fairly 2376 01:32:46,629 --> 01:32:44,480 specific and more general and the 2377 01:32:48,149 --> 01:32:46,639 specific one is is definitely for i 2378 01:32:50,229 --> 01:32:48,159 guess just for stephen 2379 01:32:51,910 --> 01:32:50,239 and the general one mostly i think 2380 01:32:53,990 --> 01:32:51,920 applies to your talk but could apply to 2381 01:32:55,350 --> 01:32:54,000 others so please chime in if it does so 2382 01:32:56,790 --> 01:32:55,360 the first one is 2383 01:32:58,470 --> 01:32:56,800 um 2384 01:33:01,430 --> 01:32:58,480 for the ribosome you said for the 2385 01:33:03,830 --> 01:33:01,440 ribosome refolding uh 2386 01:33:05,750 --> 01:33:03,840 whatever the results there 2387 01:33:07,910 --> 01:33:05,760 do i understand that the assay for that 2388 01:33:10,070 --> 01:33:07,920 was simply you had an extract and you 2389 01:33:11,510 --> 01:33:10,080 you heat it up to to unfold it and 2390 01:33:14,229 --> 01:33:11,520 re-fold and then you were using that 2391 01:33:15,990 --> 
01:33:14,239 protea uh protease assay that you use 2392 01:33:18,629 --> 01:33:16,000 for the other proteins it's all 2393 01:33:20,550 --> 01:33:18,639 was that all the same assay yeah so the 2394 01:33:23,110 --> 01:33:20,560 basic structure of the assay is you take 2395 01:33:25,510 --> 01:33:23,120 an entire extract 2396 01:33:27,990 --> 01:33:25,520 add solid guanidinium chloride to it to 2397 01:33:30,390 --> 01:33:28,000 unfold everything in it and then dilute 2398 01:33:32,390 --> 01:33:30,400 it out in order to 2399 01:33:33,270 --> 01:33:32,400 refold things and then you compare that 2400 01:33:37,750 --> 01:33:33,280 to 2401 01:33:40,070 --> 01:33:37,760 the original unfolding but where they're 2402 01:33:42,629 --> 01:33:40,080 otherwise compositionally identical they've 2403 01:33:44,790 --> 01:33:42,639 just simply had different histories and 2404 01:33:47,830 --> 01:33:44,800 then the conformation of the proteins is 2405 01:33:49,430 --> 01:33:47,840 then probed with the protease so when we 2406 01:33:51,990 --> 01:33:49,440 say that the large subunit seems to be 2407 01:33:54,870 --> 01:33:52,000 refoldable what we really mean is that 2408 01:33:56,950 --> 01:33:54,880 amongst the 36 2409 01:33:59,430 --> 01:33:56,960 large ribosomal proteins for which we 2410 01:34:01,910 --> 01:33:59,440 have data we can't tell any difference 2411 01:34:04,149 --> 01:34:01,920 in the proteolysis profile before versus 2412 01:34:06,149 --> 01:34:04,159 after but it seems to be quite different 2413 01:34:07,910 --> 01:34:06,159 for the small subunit 2414 01:34:10,709 --> 01:34:07,920 and the second more general question is 2415 01:34:12,550 --> 01:34:10,719 that uh for for your assay and for any 2416 01:34:14,870 --> 01:34:12,560 any for most of these other talks as 2417 01:34:16,229 --> 01:34:14,880 well of course membrane proteins are 2418 01:34:17,669 --> 01:34:16,239 very important to biology now or 2419 01:34:19,430 --> 01:34:17,679 probably very important from very
early 2420 01:34:21,990 --> 01:34:19,440 on perhaps some simple membrane proteins 2421 01:34:23,910 --> 01:34:22,000 but they kind of represent a 2422 01:34:25,590 --> 01:34:23,920 particularly difficult challenge i think 2423 01:34:27,750 --> 01:34:25,600 for some of these so like in your 2424 01:34:29,350 --> 01:34:27,760 refolding assay presumably you're not 2425 01:34:30,950 --> 01:34:29,360 yeah in a position to look at anything 2426 01:34:33,030 --> 01:34:30,960 but the soluble proteins and in the last 2427 01:34:35,350 --> 01:34:33,040 talk one of the screens was for 2428 01:34:37,350 --> 01:34:35,360 solubility and i think 2429 01:34:38,870 --> 01:34:37,360 maybe sort of implied that 2430 01:34:40,470 --> 01:34:38,880 that it's important to have that 2431 01:34:41,590 --> 01:34:40,480 solubility but 2432 01:34:42,870 --> 01:34:41,600 in fact there are probably a lot of 2433 01:34:44,470 --> 01:34:42,880 early proteins it's very important that 2434 01:34:46,229 --> 01:34:44,480 they not have that property that they 2435 01:34:48,629 --> 01:34:46,239 they punch into a membrane so that's 2436 01:34:50,709 --> 01:34:48,639 that's the more general question i think 2437 01:34:52,550 --> 01:34:50,719 certainly anyone who thinks that they 2438 01:34:54,470 --> 01:34:52,560 might have something relevant please 2439 01:34:55,830 --> 01:34:54,480 chime in but 2440 01:34:59,030 --> 01:34:55,840 slava do you want to comment were you 2441 01:35:02,950 --> 01:35:01,270 i can just say at least briefly for ours 2442 01:35:04,629 --> 01:35:02,960 so yeah you're absolutely right our 2443 01:35:06,709 --> 01:35:04,639 assay has a blind spot to membrane 2444 01:35:08,550 --> 01:35:06,719 proteins because we essentially lice 2445 01:35:10,229 --> 01:35:08,560 without detergent and then they all come 2446 01:35:13,189 --> 01:35:10,239 out and then we do all of our refolding 2447 01:35:14,629 --> 01:35:13,199 on the clarified extract so in essence 2448 01:35:17,350 --> 
01:35:14,639 we would love to know more about it but 2449 01:35:21,510 --> 01:35:17,360 we can't say much about it 2450 01:35:24,950 --> 01:35:22,790 hello i'm andrew wheeler from the 2451 01:35:26,950 --> 01:35:24,960 university of arizona i have a question 2452 01:35:28,550 --> 01:35:26,960 for steven freed so 2453 01:35:31,270 --> 01:35:28,560 when you're looking at these domains 2454 01:35:33,030 --> 01:35:31,280 that have different abilities to refold 2455 01:35:35,750 --> 01:35:33,040 uh you mentioned acidity and the 2456 01:35:38,310 --> 01:35:35,760 complexity of these domains but um 2457 01:35:39,990 --> 01:35:38,320 did you also look at any other sort of 2458 01:35:41,590 --> 01:35:40,000 features of the sequence for considering 2459 01:35:42,790 --> 01:35:41,600 what might be driving those differences 2460 01:35:44,870 --> 01:35:42,800 and how well they can refold maybe 2461 01:35:46,229 --> 01:35:44,880 repeat that with your mask tipped down 2462 01:35:50,229 --> 01:35:46,239 sorry 2463 01:35:51,430 --> 01:35:50,239 yeah so uh when you're looking at these 2464 01:35:54,470 --> 01:35:51,440 different domains with different 2465 01:35:56,390 --> 01:35:54,480 abilities to refold you talked about 2466 01:35:58,149 --> 01:35:56,400 acidity and the complexity of them but 2467 01:35:59,910 --> 01:35:58,159 have you considered any other features 2468 01:36:01,830 --> 01:35:59,920 of these sequences that might be driving 2469 01:36:05,109 --> 01:36:01,840 their ability to refold 2470 01:36:07,830 --> 01:36:05,119 so at a very gross level the sequence 2471 01:36:10,709 --> 01:36:07,840 will be reflected in those sort of 2472 01:36:13,430 --> 01:36:10,719 ecod fold groups just because 2473 01:36:15,590 --> 01:36:13,440 as sort of claudia spoke very elegantly 2474 01:36:18,629 --> 01:36:15,600 about the we can sort of use hidden 2475 01:36:20,470 --> 01:36:18,639 markov models in order to group proteins 2476 01:36:22,550 --> 01:36:20,480 together to these sort of
lineages that 2477 01:36:24,709 --> 01:36:22,560 will of course have some sequence 2478 01:36:26,790 --> 01:36:24,719 conservation so in that sense when we 2479 01:36:28,790 --> 01:36:26,800 say that ob folds 2480 01:36:30,709 --> 01:36:28,800 always refold we are saying something 2481 01:36:32,709 --> 01:36:30,719 about that you know sort of neighborhood 2482 01:36:34,709 --> 01:36:32,719 of sequence compositions that have that 2483 01:36:36,790 --> 01:36:34,719 property but in terms of like whether or 2484 01:36:38,629 --> 01:36:36,800 not like kind of like a bag of letters 2485 01:36:40,870 --> 01:36:38,639 type of you know analysis of other 2486 01:36:42,070 --> 01:36:40,880 certain amino acids that correlates and 2487 01:36:44,149 --> 01:36:42,080 we haven't done that that would be a 2488 01:36:47,109 --> 01:36:44,159 good thing to do 2489 01:36:51,109 --> 01:36:50,310 hi i'm jason greenwald from etheric uh 2490 01:36:52,470 --> 01:36:51,119 so 2491 01:36:54,390 --> 01:36:52,480 i have to first preface this with saying 2492 01:36:55,430 --> 01:36:54,400 i'm a bit biased because i'm 2493 01:36:56,470 --> 01:36:55,440 very 2494 01:36:58,870 --> 01:36:56,480 much 2495 01:37:00,950 --> 01:36:58,880 in favor of not in any like 2496 01:37:02,390 --> 01:37:00,960 so i really believe it's true but it's 2497 01:37:04,390 --> 01:37:02,400 what i study is 2498 01:37:06,950 --> 01:37:04,400 amyloid peptide aggregation in the 2499 01:37:09,189 --> 01:37:06,960 origin of life and so i have this 2500 01:37:11,430 --> 01:37:09,199 thought that perhaps early 2501 01:37:14,310 --> 01:37:11,440 proteins came out of amyloid structures 2502 01:37:16,470 --> 01:37:14,320 and took um i think it was joanna who 2503 01:37:18,229 --> 01:37:16,480 was making comments about um 2504 01:37:21,830 --> 01:37:18,239 stretches of hydrophobic 2505 01:37:23,669 --> 01:37:21,840 residues being selected against or i'm 2506 01:37:25,510 --> 01:37:23,679 not even sure i remember the detail now 2507 
01:37:26,310 --> 01:37:25,520 but i just want to point out that there 2508 01:37:28,229 --> 01:37:26,320 are 2509 01:37:30,950 --> 01:37:28,239 studies and one i remember uh from 2510 01:37:32,390 --> 01:37:30,960 conflict saying that organism complexity 2511 01:37:34,870 --> 01:37:32,400 anti-correlates with the beta 2512 01:37:36,709 --> 01:37:34,880 aggregation propensity of the proteome 2513 01:37:39,750 --> 01:37:36,719 so that being that the more simple 2514 01:37:41,109 --> 01:37:39,760 organisms in principle the older ones uh 2515 01:37:42,950 --> 01:37:41,119 perhaps have more 2516 01:37:45,350 --> 01:37:42,960 propensity in their proteins to 2517 01:37:47,990 --> 01:37:45,360 have beta aggregation that sort of fits 2518 01:37:49,590 --> 01:37:48,000 with my not my theory but a theory of 2519 01:37:52,229 --> 01:37:49,600 early proteins coming from 2520 01:37:54,470 --> 01:37:52,239 beta-structured aggregates but also it 2521 01:37:56,550 --> 01:37:54,480 relates to your work stefan i think 2522 01:37:58,550 --> 01:37:56,560 they're saying that if refoldability 2523 01:38:00,950 --> 01:37:58,560 which i think is a super cool idea as a 2524 01:38:03,750 --> 01:38:00,960 potential um 2525 01:38:04,550 --> 01:38:03,760 marker of how old a peptide or a protein 2526 01:38:05,510 --> 01:38:04,560 is 2527 01:38:06,950 --> 01:38:05,520 um 2528 01:38:08,310 --> 01:38:06,960 it might also be 2529 01:38:10,550 --> 01:38:08,320 that there's 2530 01:38:11,669 --> 01:38:10,560 uh some part of the refoldability that 2531 01:38:14,470 --> 01:38:11,679 doesn't necessarily show up in your 2532 01:38:15,990 --> 01:38:14,480 assay because it's aggregation 2533 01:38:18,470 --> 01:38:16,000 that's so sorry that was more like 2534 01:38:20,550 --> 01:38:18,480 blabbing than a question um so i'll make 2535 01:38:22,229 --> 01:38:20,560 one question then and someone else can 2536 01:38:25,750 --> 01:38:22,239 talk if they want to 2537 01:38:29,910 --> 01:38:28,550 yeah you
talked about uh that's super 2538 01:38:32,709 --> 01:38:29,920 cool talk by the way i really like that 2539 01:38:34,950 --> 01:38:32,719 kind of work uh where you're replacing 2540 01:38:36,950 --> 01:38:34,960 trying to make primitive looking uh 2541 01:38:38,310 --> 01:38:36,960 proteins um 2542 01:38:40,870 --> 01:38:38,320 but you showed that the fold was 2543 01:38:41,990 --> 01:38:40,880 different i think by the cd right so you 2544 01:38:44,149 --> 01:38:42,000 had something that wasn't folded but 2545 01:38:45,590 --> 01:38:44,159 then you try to model it folded is did i 2546 01:38:47,350 --> 01:38:45,600 miss something or is that sort of a 2547 01:38:48,870 --> 01:38:47,360 little bit out of sync 2548 01:38:50,149 --> 01:38:48,880 with what you expect or do you think it 2549 01:38:52,310 --> 01:38:50,159 may have 2550 01:38:54,550 --> 01:38:52,320 retained some of the same fold 2551 01:38:56,950 --> 01:38:54,560 no actually we were it was kinda we kind 2552 01:38:59,270 --> 01:38:56,960 of expected that the the 2553 01:39:01,510 --> 01:38:59,280 mostly the the 2554 01:39:03,510 --> 01:39:01,520 the variants got like almost 30 percent 2555 01:39:05,350 --> 01:39:03,520 of difference and we kind of expected that 2556 01:39:07,189 --> 01:39:05,360 it was supposed to lose the function the 2557 01:39:08,390 --> 01:39:07,199 the structure so 2558 01:39:10,629 --> 01:39:08,400 and uh 2559 01:39:13,189 --> 01:39:10,639 yeah so 2560 01:39:15,109 --> 01:39:13,199 it was kind of possible i can also like 2561 01:39:16,950 --> 01:39:15,119 uh 2562 01:39:19,669 --> 01:39:16,960 it's typical of like this prebiotic 2563 01:39:22,310 --> 01:39:19,679 protein of like uh like smoke this 2564 01:39:24,550 --> 01:39:22,320 early alphabet as also slava showed 2565 01:39:26,790 --> 01:39:24,560 before they tend to be more disordered 2566 01:39:28,709 --> 01:39:26,800 so it kind of fit with our 2567 01:39:31,830 --> 01:39:28,719 like with our theory that it's not 2568
01:39:33,270 --> 01:39:31,840 needed no surprise to us like 2569 01:39:36,149 --> 01:39:33,280 thanks i know everyone wants to go home 2570 01:39:37,990 --> 01:39:36,159 but um just one quick question for cloud 2571 01:39:39,270 --> 01:39:38,000 sorry i get the names right here 2572 01:39:41,270 --> 01:39:39,280 claudia 2573 01:39:43,910 --> 01:39:41,280 right you talked about the order i i 2574 01:39:45,990 --> 01:39:43,920 like that concept too of uh how 2575 01:39:47,590 --> 01:39:46,000 proteins are evolving their structures 2576 01:39:50,310 --> 01:39:47,600 but can you tell the direction it's 2577 01:39:52,870 --> 01:39:50,320 going i i um geometry kind of lost me 2578 01:39:55,510 --> 01:39:52,880 there can you say it's going from sh3 to 2579 01:39:58,709 --> 01:39:55,520 cradle not reverse 2580 01:40:01,189 --> 01:39:58,719 yeah with these patterns we can have 2581 01:40:03,189 --> 01:40:01,199 some idea of the 2582 01:40:05,910 --> 01:40:03,199 which one is the ancestral fold and 2583 01:40:08,070 --> 01:40:05,920 which one is the daughter fold but not 2584 01:40:09,270 --> 01:40:08,080 all the time so in circular permutation 2585 01:40:15,830 --> 01:40:09,280 it's hard 2586 01:40:19,510 --> 01:40:17,830 i think we're a little bit overdue so 2587 01:40:23,260 --> 01:40:19,520 please join me in thanking all of our 2588 01:40:28,149 --> 01:40:25,830 [Applause] 2589 01:40:31,669 --> 01:40:28,159 and hopefully there's a chance to chat